Real-World Data. Real-World Depth.

Explore the scale of our ambulatory dataset across key therapeutic areas. From top diagnosis codes to longitudinal patient journeys, get the evidence you need to drive discovery.

Primary Care & Family Medicine 

Patient Reach: 12M+ Longitudinal Records

Top Diagnosis Codes:

I10 – Essential (primary) hypertension
E11.9 – Type 2 diabetes mellitus without complications 
E78.5 – Hyperlipidemia, unspecified 

Data Highlight: Decades of wellness visit trends and chronic disease management tracking. 

Cardiology

Patient Reach: 1.6M+ Specialty Encounters

Top Diagnosis Codes:

I48.0 – Paroxysmal atrial fibrillation 
I25.10 – Atherosclerotic heart disease of native coronary artery without angina pectoris
I50.9 – Heart failure, unspecified  

Data Highlight: Rich integration of ECG results, vitals history, and cardiac-specific medication adherence. 

 

OBGYN & Women’s Health 

Patient Reach: 4.3M+ Active Patients 

Top Diagnosis Codes:

Z34.90 – Supervision of normal pregnancy, unspecified trimester 
N80.0 – Endometriosis of uterus 
N95.1 – Menopausal and female climacteric states 

Data Highlight: Comprehensive maternal health tracking and longitudinal reproductive history. 

Pediatrics

Patient Reach: 1.6M+ Pediatric Patients 

Top Diagnosis Codes:

J06.9 – Acute upper respiratory infection, unspecified 
J30.9 – Allergic rhinitis, unspecified 
J45.909 – Unspecified asthma, uncomplicated 

Data Highlight: Growth chart tracking, immunization records, and pediatric-specific dosing trends. 

Gastroenterology

Patient Reach: 600k+ Specialized Records 

Top Diagnosis Codes:

K21.9 – GERD without esophagitis 
K58.0 – IBS with diarrhea 
K50.90 – Crohn's disease, unspecified, without complications 


Oncology

Patient Reach: 83k+ Active Patients 

Top Diagnosis Codes:

C50 – Malignant neoplasm of breast
C61 – Malignant neoplasm of prostate
C73 – Malignant neoplasm of thyroid gland

Data Highlight: Comprehensive oncology treatment records, staging insights, biomarker data, and therapy-specific medication adherence trends.

Beyond the Diagnosis

Our data goes deeper than ICD-10 codes. Every specialty dataset includes access to:  

Clinical Notes

De-identified SOAP notes for deep-context NLP.

Lab Results

Standardized LOINC codes and longitudinal lab values.

Medications

Prescribing patterns, fill data, and therapeutic switches.

Vitals

BMI, blood pressure, and heart rate trends over time.

Where Does Our Data Come From?

 

 Frequently Asked Questions 

How is Sidus Insights data standardized for research use?

All data ingested into the Sidus platform is harmonized to a single proprietary curation standard regardless of its originating EHR or RCM system. Lab results are standardized to LOINC codes, diagnoses to ICD-10, and medications to RxNorm, ensuring interoperability with analytical tools and consistency across specialties. This pre-standardization allows researchers to begin analysis immediately without custom data cleaning pipelines. 
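As a rough illustration of what working with pre-standardized codes can look like, the sketch below checks whether a code has the expected shape for its code system before analysis begins. It is not Sidus's curation logic: the `CODE_PATTERNS` table and the `is_well_formed` and `split_by_shape` helpers are hypothetical names introduced here, and the patterns test structure only, not whether a code exists in the official ICD-10, LOINC, or RxNorm releases.

```python
import re

# Hypothetical structural patterns (assumption for illustration only).
# They check the shape of a code, not membership in the official code set.
CODE_PATTERNS = {
    # Letter, digit, alphanumeric, then an optional dot and 1-4 more characters.
    "ICD-10": re.compile(r"[A-Z][0-9][0-9A-Z](\.[0-9A-Z]{1,4})?"),
    # Numeric body, hyphen, single check digit.
    "LOINC": re.compile(r"[0-9]{1,7}-[0-9]"),
    # RxNorm concept identifiers (RxCUIs) are plain integers.
    "RxNorm": re.compile(r"[0-9]+"),
}


def is_well_formed(system: str, code: str) -> bool:
    """Return True if `code` has the expected shape for `system`."""
    pattern = CODE_PATTERNS.get(system)
    if pattern is None:
        raise ValueError(f"unknown code system: {system}")
    return pattern.fullmatch(code.strip().upper()) is not None


def split_by_shape(rows):
    """Split (system, code) pairs into well-formed and malformed lists."""
    good, bad = [], []
    for system, code in rows:
        (good if is_well_formed(system, code) else bad).append((system, code))
    return good, bad


if __name__ == "__main__":
    sample = [("ICD-10", "I10"), ("ICD-10", "E11.9"), ("ICD-10", "11.9")]
    good, bad = split_by_shape(sample)
    print("well-formed:", good)
    print("malformed:", bad)
```

A shape check like this is only a first-pass sanity test; confirming that a code is valid and current would require the published code sets themselves.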

Can Sidus Insights data be used for natural language processing (NLP) research?

Yes. Sidus Insights provides access to de-identified SOAP notes and unstructured clinical data that support NLP research, including clinical phenotyping, symptom extraction, and treatment pathway analysis. With approximately 165 million unstructured files, Sidus offers one of the largest ambulatory clinical text datasets in the U.S. 

What is longitudinal patient data and why does it matter for research?

 Longitudinal patient data tracks a patient's healthcare journey over time across multiple visits and care settings. It helps researchers study disease progression, treatment patterns, and outcomes, supporting more robust clinical and outcomes research.  
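To make the idea concrete, here is a minimal sketch of turning flat encounter rows into per-patient timelines. The tuple layout `(patient_id, visit_date, diagnosis_code)`, the patient identifiers, and the helper names `build_timelines` and `follow_up_days` are assumptions made for this example; they do not describe the actual Sidus data model.

```python
from collections import defaultdict
from datetime import date


def build_timelines(encounters):
    """Group (patient_id, visit_date, diagnosis_code) rows by patient,
    with each patient's visits sorted from earliest to latest."""
    timelines = defaultdict(list)
    for patient_id, visit_date, code in encounters:
        timelines[patient_id].append((visit_date, code))
    for visits in timelines.values():
        visits.sort(key=lambda visit: visit[0])  # order by visit date only
    return dict(timelines)


def follow_up_days(visits):
    """Days between a patient's first and last recorded visit."""
    return (visits[-1][0] - visits[0][0]).days


if __name__ == "__main__":
    # Made-up rows for illustration; not real patient data.
    rows = [
        ("p1", date(2021, 3, 1), "E11.9"),
        ("p1", date(2019, 1, 15), "I10"),
        ("p2", date(2020, 1, 1), "J45.909"),
        ("p2", date(2020, 1, 31), "J45.909"),
    ]
    for patient_id, visits in build_timelines(rows).items():
        print(patient_id, [code for _, code in visits], follow_up_days(visits))
```

Ordering visits per patient is what allows questions such as "which diagnosis came first" or "how long was this patient followed" to be answered, which a single-visit snapshot cannot support.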

How can a researcher access Sidus Insights data for a specific therapeutic area?

 Researchers can contact Sidus Insights to discuss their data requirements. Sidus provides customized datasets and cohorts tailored to specific therapeutic areas, study criteria, and research objectives. 

How many longitudinal patient records does Sidus Insights hold for Primary Care?

Sidus Insights holds more than 12 million longitudinal records in its Primary Care and Family Medicine dataset. This includes decades of wellness visit trends, chronic disease management data, and multi-year follow-up records covering conditions such as hypertension (ICD-10: I10), Type 2 diabetes mellitus (E11.9), and hyperlipidemia (E78.5).