Machine learning analysis suggests that there are four sub-phenotypes of long COVID

Four sub-phenotypes of long COVID

In a paper published in Nature Medicine, researchers identified distinct sub-phenotypes of PASC [post-acute sequelae of coronavirus disease 2019 (COVID-19)] based on conditions newly detected 30 to 180 days after acute infection with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2).


Previous studies have mostly examined PASC conditions individually, without assessing how they co-occur. The degree to which PASC conditions and symptoms co-occur or develop disproportionately among particular patients, known as sub-phenotypes or co-incidence patterns, may help to clarify PASC pathogenesis.

About the study

In the current work, researchers used a data-driven methodology based on machine learning to identify PASC sub-phenotypes.

INSIGHT and OneFlorida+ are two large clinical research networks (CRNs) within PCORnet, the national Patient-Centered Clinical Research Network. The INSIGHT CRN covers about 12 million patients in New York City (NYC), while the OneFlorida+ CRN covers about 19 million patients in Florida, Georgia, and Alabama.

The development cohort (n=20,881) and the validation cohort (n=13,724) were drawn from the INSIGHT and OneFlorida+ CRNs, respectively. Patients who tested positive for SARS-CoV-2 had their conditions evaluated between 30 and 180 days after the documented infection.

COVID-19 was identified by a positive SARS-CoV-2 antigen test or nucleic acid amplification test between March 2020 and November 2021. The prevalence of 137 potential PASC conditions, defined as CCSR (Clinical Classifications Software Refined) categories of ICD-10 codes, was evaluated.

A topic modelling (TM) approach was used to find co-incidence patterns of PASC conditions, from which the PASC sub-phenotypes were derived. First, each patient's PASC conditions were encoded as a high-dimensional binary representation (step 1). The algorithm then learned PASC topics (T) with a topic-modelling technique (step 2) and inferred patient representations in the low-dimensional PASC topic space (step 3). Finally, PASC sub-phenotypes were identified as clusters of patients in the PASC topic space (step 4).
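Step 1 of this pipeline can be illustrated with a minimal sketch. It only shows how a binary patient-by-condition representation might be built; the topic model itself is not reproduced here. The function name, the condition labels, and the toy patients are all hypothetical and are not taken from the study.

```python
def binary_condition_matrix(patient_conditions, vocabulary):
    """Encode each patient's set of conditions as a 0/1 row over a fixed vocabulary.

    Conditions not present in the vocabulary are ignored.
    """
    index = {code: j for j, code in enumerate(vocabulary)}
    matrix = []
    for conditions in patient_conditions:
        row = [0] * len(vocabulary)
        for code in conditions:
            j = index.get(code)
            if j is not None:
                row[j] = 1
        matrix.append(row)
    return matrix


def condition_prevalence(matrix):
    """Fraction of patients with each condition (column means of the binary matrix)."""
    n_patients = len(matrix)
    return [sum(column) / n_patients for column in zip(*matrix)]


# Hypothetical labels standing in for the 137 CCSR categories used in the study.
vocabulary = ["anxiety", "sleep_disorder", "kidney_failure", "chest_pain"]

# Hypothetical patients, each described by the set of conditions recorded for them.
patients = [
    {"anxiety", "sleep_disorder"},
    {"kidney_failure"},
    set(),
    {"anxiety", "chest_pain"},
]

matrix = binary_condition_matrix(patients, vocabulary)
prevalence = condition_prevalence(matrix)
```

In a real analysis the rows of such a matrix would be the input to the topic model, and the vocabulary would hold all 137 condition categories rather than four illustrative ones.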

Using the resulting heat maps, the PASC co-incidence patterns of SARS-CoV-2-positive and SARS-CoV-2-negative individuals were compared, and the entropy of each topic vector was determined. The robustness of the identified PASC sub-phenotypes was assessed using propensity score (PS)-based adjustments. The team also compared the topics quantitatively: cosine similarity was computed between the initial set of topics learned from the 137 PASC conditions and the corresponding topics learned from the two CRN cohorts.
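The two measures mentioned above are standard and easy to sketch. The example below assumes a topic is represented as a vector of non-negative weights over conditions; the function names and the toy vectors are illustrative only and do not come from the study.

```python
import math


def topic_entropy(weights):
    """Shannon entropy (in bits) of a topic vector, after normalising it to sum to 1.

    Lower entropy means the topic's weight is concentrated on fewer conditions.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("topic vector must have positive total weight")
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log2(p) for p in probs)


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical topic vectors over four conditions.
concentrated_topic = [0.7, 0.2, 0.05, 0.05]  # weight focused on one condition
diffuse_topic = [0.25, 0.25, 0.25, 0.25]     # weight spread evenly

entropy_concentrated = topic_entropy(concentrated_topic)
entropy_diffuse = topic_entropy(diffuse_topic)
similarity = cosine_similarity(concentrated_topic, diffuse_topic)
```

Here the evenly spread topic has the higher entropy, and a cosine similarity close to 1 between topics learned in two cohorts would indicate that the topics weight the same conditions in similar proportions.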


Four distinct PASC sub-phenotypes were identified. Sub-phenotype 1 included 7,047 patients (34%) and was dominated by renal, circulatory, and cardiac conditions (T-3, 8, 10), including kidney failure, circulatory and cardiac problems, and fluid and electrolyte imbalance. The patients' average age was 65, and 49% of them were male. These patients had severe acute COVID-19, with high rates of hospitalisation (61%), mechanical ventilation (5%), and critical care admission (10%).

This sub-phenotype had the highest proportion (37%) of patients who tested positive for SARS-CoV-2 during the initial COVID-19 wave (between March and June 2020). These patients had a high burden of comorbidities and were primarily treated for anaemia, circulatory problems, and endocrine problems.

Sub-phenotype 2 was dominated by respiratory, sleep, and anxiety conditions. It included 6,838 patients (33%), and the most common conditions were chest discomfort, headaches, anxiety, sleep disturbances, and pulmonary illnesses (T-4, 7, 9). Of these patients, 31% were hospitalised during acute COVID-19 and 63% were female; the median age was 51.

Of the patients in this sub-phenotype, 65% were diagnosed with COVID-19 between November 2020 and November 2021. The drugs most commonly prescribed to sub-phenotype 2 patients were anti-allergy, anti-inflammatory, and anti-asthma medications such as inhaled steroids, montelukast, and levalbuterol.

Sub-phenotype 3 comprised 4,879 patients (23%) with predominantly nervous system and musculoskeletal conditions (T-1, 5, 6), such as headaches, sleep issues, and musculoskeletal discomfort. Of these patients, 61% were female, and the median age was 57. Most (78%) had more than five outpatient visits before COVID-19. Analgesics (such as ketorolac and ibuprofen) were the drugs most commonly prescribed to patients in this sub-phenotype.

Sub-phenotype 4 comprised 2,117 patients (10%) with mostly digestive and respiratory conditions (T-2, 4, 8). The median patient age was 54 years, and 62% of the patients were female. During acute COVID-19, this group had the lowest rates of mechanical ventilation (1%) and critical care admission (3%), as well as the highest rate of no emergency department visits (57%). Medications for digestive system disorders were typically prescribed to patients in this sub-phenotype.
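As a quick consistency check on the figures reported above, the four sub-phenotype counts can be compared with the 20,881-patient cohort. The sketch below uses only numbers quoted in this article; the variable names are illustrative.

```python
# Cohort size and sub-phenotype counts as quoted in the article.
COHORT_SIZE = 20_881
subphenotype_counts = {1: 7_047, 2: 6_838, 3: 4_879, 4: 2_117}

# The four groups should partition the cohort exactly.
total = sum(subphenotype_counts.values())

# Share of the cohort in each sub-phenotype, rounded to whole percentages.
shares = {
    label: round(100 * count / COHORT_SIZE)
    for label, count in subphenotype_counts.items()
}

print(total)   # 20881
print(shares)  # {1: 34, 2: 33, 3: 23, 4: 10}
```

The counts sum to the full cohort, and the rounded shares match the percentages given for each sub-phenotype.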

Topics learned from SARS-CoV-2-negative individuals had higher entropy values than those learned from SARS-CoV-2-positive patients. The cosine similarity results supported the robustness of the PASC sub-phenotype classification, and the co-incidence patterns were comparable across the two CRN cohorts for SARS-CoV-2-positive individuals. In contrast, the topics learned from SARS-CoV-2-positive individuals, which had lower entropy and hence more concentrated patterns, differed from those learned from uninfected individuals.


Overall, the study identified four reproducible, data-driven PASC sub-phenotypes using machine learning. The results may help health officials manage PASC more effectively.
