Information has always been essential to new developments and to better organisation. The more knowledge we have, the more effectively we can organise ourselves to produce the best results. This is why data collection is crucial for every organisation. The same information can also be used to identify present trends and to forecast future events. As we have become more aware of this, we have used technical advancements to produce and collect more data about nearly everything. We are now constantly bombarded with data from all facets of our lives, including social interactions, science, our jobs, and our health. The current state of affairs can be likened to a data deluge.
Technical advancements have allowed us to generate ever-increasing amounts of data, to the point that the volume is now beyond the capabilities of existing technologies. As a result, the term “big data” was coined to refer to vast and unmanageable amounts of data. To meet our current and future social needs, we need new methods for organising this data so that we can extract useful information from it. Healthcare is one such societal need. Healthcare organisations, like those in many other industries, are producing data at great velocity, which brings both benefits and difficulties. This overview covers the fundamentals of big data, including its management, analysis, and future possibilities, particularly in the healthcare sector.
The quantity of data
People working in organisations all over the world produce vast amounts of data every day. The phrase “digital universe” describes, in quantitative terms, the enormous amount of data that is produced, copied, and used in a single year. According to the International Data Corporation (IDC), the approximate size of the digital universe in 2005 was 130 exabytes (EB). By 2017 it had reached around 16,000 EB, or 16 zettabytes (ZB). IDC forecast that the digital universe would reach 40,000 EB by 2020. To get an idea of this scale, that amounts to around 5200 gigabytes (GB) of data for every person. This illustrates the extraordinary rate at which the digital universe is expanding. Internet giants such as Google and Facebook have been collecting and storing massive amounts of data.
For instance, depending on our preferences, Google may retain a range of data, such as user location, ad preferences, a list of applications used, internet browsing history, contacts, bookmarks, emails, and other essential user data. Similarly, Facebook stores and analyses more than 30 petabytes (PB) of user-generated data. Such vast volumes of data are what is meant by “big data”. Over the past ten years, the IT sector has successfully used big data to produce vital information that has the potential to yield large profits.
These findings have drawn so much attention that a new branch of science, known as “data science”, has emerged as a result. Data science covers many topics, including data management and analysis, with the aim of gaining deeper insights and improving a system’s performance or services. In addition, inventive and insightful techniques for visualising big data after analysis have made it easier to understand how a complex system works. As a growing portion of society becomes aware of big data and takes part in generating it, it has become necessary to define what big data is. In this review, we therefore attempt to describe the role that big data has played in transforming the worldwide healthcare sector and its impact on our daily lives.
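The per-person figure quoted earlier in this section can be checked with simple arithmetic. The sketch below divides the 40,000 EB forecast by a world population of about 7.7 billion. The population figure is an assumption made for illustration and does not come from the text, as are the names `per_capita_gb`, `FORECAST_EB`, and `ASSUMED_POPULATION`. Decimal (SI) units are used, so 1 EB is taken as 10^9 GB.

```python
# Decimal (SI) units: 1 exabyte = 10**9 gigabytes.
GB_PER_EB = 10**9


def per_capita_gb(total_eb: float, population: float) -> float:
    """Return the average data volume per person, in gigabytes."""
    return total_eb * GB_PER_EB / population


# 40,000 EB is the IDC forecast quoted in the text.
FORECAST_EB = 40_000
# Assumption (not from the text): a world population of about 7.7 billion.
ASSUMED_POPULATION = 7.7e9

share_gb = per_capita_gb(FORECAST_EB, ASSUMED_POPULATION)
print(f"about {share_gb:.0f} GB per person")
```

Under this assumption the result is close to the 5200 GB quoted above. A different population estimate shifts the figure proportionally.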
Healthcare as a big data repository
Healthcare is a multifaceted system established with the sole purpose of preventing, diagnosing, and treating health problems or impairments in human beings. The main parts of a healthcare system are the health professionals (physicians or nurses), the health facilities (clinics and hospitals that deliver medicines and other diagnostic or treatment technologies), and a financing institution that supports the first two.
Health professionals work in a variety of fields, including nursing, psychiatry, physiotherapy, dentistry, and many more. Different levels of healthcare are required depending on the urgency of the situation. Professionals provide the first point of consultation (primary care), acute care requiring skilled professionals (secondary care), advanced medical investigation and treatment (tertiary care), and highly uncommon diagnostic or surgical procedures (quaternary care). At each of these levels, health professionals are responsible for various types of information, including the patient’s medical history (diagnoses and prescriptions), medical and clinical data (such as data from imaging and laboratory examinations), and other private or personal medical information.
It used to be standard practice to store such medical data for a patient as typed or handwritten notes and reports. Even the results of medical examinations were kept in a paper filing system. This practice is very old: the earliest known case records appear in an Egyptian papyrus text dating from 1600 BC. In Stanley Reiser’s words, clinical case records “capture the event of disease as a novel in which the patient, family, and doctor are a part of the storyline.”
Thanks to the development of computer systems and their capabilities, the digitization of clinical examinations and medical records has become a standard and widely adopted practice in healthcare systems. In 2003, the Institute of Medicine, a division of the National Academies of Sciences, Engineering, and Medicine, chose the phrase “electronic health records” to describe records maintained to improve the healthcare sector for the benefit of patients and clinicians. According to Murphy, Hanken, and Waters, “Electronic health records (EHR) are computerised medical records for patients that include any information about a person’s past, present, or future physical or mental health or condition that is stored in electronic system(s) used to capture, transmit, receive, store, retrieve, link, and manipulate multimedia data for the main purpose of providing healthcare and health-related services.”
Healthcare digitization and big data
An electronic medical record (EMR), like an electronic health record (EHR), stores the standard clinical and medical information gathered from patients. EHRs, EMRs, personal health records (PHRs), medical practice management (MPM) software, and many other healthcare data components have the potential to lower healthcare costs while improving the quality and efficiency of services. Big data in healthcare comprises payer-provider data (such as EMRs, pharmacy prescriptions, and insurance records), genomics-driven studies (such as genotyping and gene expression data), and other data gathered through the smart web of the Internet of Things (IoT).
EHR adoption was slow at the start of the twenty-first century but increased significantly after 2009. The management and use of such healthcare data have become increasingly dependent on information technology. In particular, the development and use of wellness monitoring devices and related software that can generate alerts and share a patient’s health data with the relevant healthcare providers have gained momentum, especially for establishing real-time biomedical and health monitoring systems. These devices produce enormous amounts of data, which can be analysed to provide clinical or medical care in real time. Big data from the healthcare industry holds promise for improving health outcomes and reducing costs.
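To make the idea of monitoring software that raises alerts more concrete, here is a minimal sketch of a threshold check over a stream of heart-rate readings. Everything in it is hypothetical: the `Reading` type, the `find_alerts` function, and the limits `LOW_BPM` and `HIGH_BPM` are illustrative choices, not clinical guidance and not the interface of any real monitoring product.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Reading:
    """One heart-rate sample reported by a (hypothetical) wearable device."""
    patient_id: str
    heart_rate_bpm: int


# Hypothetical limits chosen for illustration only; not clinical guidance.
LOW_BPM = 40
HIGH_BPM = 130


def find_alerts(readings: Iterable[Reading]) -> List[str]:
    """Return one alert message for each reading outside the allowed range."""
    alerts = []
    for r in readings:
        if r.heart_rate_bpm < LOW_BPM or r.heart_rate_bpm > HIGH_BPM:
            alerts.append(
                f"ALERT {r.patient_id}: heart rate {r.heart_rate_bpm} bpm"
            )
    return alerts


# Three sample readings; the second and third fall outside the range.
stream = [Reading("p1", 72), Reading("p2", 145), Reading("p1", 38)]
alerts = find_alerts(stream)
for message in alerts:
    print(message)
```

A real system would also need to deliver these messages securely to the responsible healthcare provider, which this sketch leaves out.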
Biomedical research using big data
A biological system, such as a human cell, exhibits complex molecular and physical interactions. To understand the interdependencies among the many components and events of such a complex system, a biomedical or biological experiment typically collects data on a smaller or simpler component. As a result, many such simplified experiments are needed to produce a comprehensive map of a particular biological phenomenon of interest. This suggests that the more data we have, the better we understand biological processes. This idea has greatly accelerated the development of modern techniques. One can imagine, for instance, the volume of data produced since technologies such as next-generation sequencing (NGS) and genome-wide association studies (GWAS) were adopted to decode human genetics.
NGS-based data provides information at depths that were previously inaccessible and takes the experimental scenario to a completely new dimension. It has increased the resolution at which we observe or record biological events associated with specific diseases in real time. The idea that large amounts of data can reveal information that often remains unidentified or hidden in smaller experiments has ushered in the ‘-omics’ era. The ‘-omics’ disciplines have progressed significantly: instead of studying a single ‘gene’, scientists can now study the whole ‘genome’ of an organism in ‘genomics’ studies within a given amount of time. Similarly, instead of studying the expression or ‘transcription’ of a single gene, we can now study the expression of all the genes, or the entire ‘transcriptome’, of an organism in ‘transcriptomics’ studies.
Each of these individual experiments produces a substantial amount of data that is more detailed than ever. However, even this depth and resolution may not provide all the information needed to fully explain a particular mechanism or event. To gain new insights, one therefore often has to analyse a large amount of data gathered from several experiments. A steady increase in publications on the use of big data in healthcare attests to this. The analysis of such large amounts of data from medical and healthcare systems can greatly aid the development of new healthcare policies. The most recent technical advancements in data generation, collection, and analysis have raised hopes for an impending revolution in personalised medicine.
Characteristics of big data in healthcare
Due to their abundance of data, EHRs can support clinical decision-making and enable advanced analytics. But today, a sizable percentage of this data is unstructured. Information that does not follow a pre-established model or organisational system is considered unstructured data.
One reason for choosing an unstructured format may simply be that information can be recorded in a wide variety of formats. Another is that structured input methods such as drop-down menus, radio buttons, and check boxes often struggle to capture complex data. For instance, non-standard information about a patient’s clinical suspicions, socioeconomic data, patient preferences, important lifestyle factors, and other related details can only be captured in an unstructured form. It is challenging to combine such disparate but crucial sources of data into a comprehensible or uniform format that algorithms can then analyse to understand and improve patient care. Nonetheless, the healthcare industry needs to utilise the full potential of these rich streams of information to enhance the patient experience. In the healthcare sector, this could take the form of better management, better care, and affordable treatments.
We are still a long way from fully realising the advantages of big data and capturing the insights that result from it. To accomplish these objectives, we must manage and analyse big data systematically.
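The contrast between structured input and free text discussed in this section can be illustrated with a small sketch. The record below combines structured fields, of the kind a check box or drop-down menu would fill, with an unstructured note. A deliberately naive keyword match then assigns coarse tags to the note. The `EhrEntry` type, the `tag_note` function, and the keyword list are all hypothetical and are not taken from any EHR standard or product. Real clinical text processing is far more involved.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class EhrEntry:
    # Structured fields: fixed names and constrained values, easy to query.
    patient_id: str
    smoker: bool      # e.g. filled from a check box
    blood_type: str   # e.g. filled from a drop-down menu
    # Unstructured field: free text with no pre-established model.
    note: str = ""


# Hypothetical keyword list used only for this illustration.
NOTE_KEYWORDS: Dict[str, List[str]] = {
    "clinical_suspicion": ["suspect"],
    "lifestyle": ["night shift", "exercise"],
    "preference": ["prefers"],
}


def tag_note(note: str) -> List[str]:
    """Return the sorted tags whose keywords occur in the free-text note."""
    text = note.lower()
    return sorted(
        tag
        for tag, words in NOTE_KEYWORDS.items()
        if any(word in text for word in words)
    )


entry = EhrEntry(
    patient_id="p-001",
    smoker=False,
    blood_type="O+",
    note="Works night shifts; prefers morning appointments. "
         "Clinician suspects a reaction to the new medication.",
)
tags = tag_note(entry.note)
print(tags)
```

The structured fields can be filtered or aggregated directly, whereas the note yields usable tags only after an extra processing step, and a keyword match like this one will miss anything phrased differently. That gap is the integration difficulty described above.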