AI Finds Smoking Affects the Biological Clock

According to the Centers for Disease Control and Prevention (CDC), cigarette smoking causes more than 480,000 deaths every year in the United States. That is more deaths than HIV, illegal drug use, alcohol use, motor vehicle injuries, and firearm-related incidents combined.

Numerous studies show an association between smoking and cancer, cardiovascular disease, and all-cause mortality. Now, a new study published in Nature – Scientific Reports adds a new component: biological age.

The researchers report that smoking makes people biologically older. Using deep learning, they determined that smokers exhibit higher aging rates than non-smokers. According to the team, the study is the first to predict a biological age based on tobacco smoking.

“We demonstrate for the first time that smoking status can be predicted using blood biochemistry and cell count results and the recent advances in AI and machine learning,” says Dr. Olga Kovalchuk, a professor at the University of Lethbridge in Alberta, Canada.

The paper brought together a team of AI researchers and aging experts from several organizations, including the University of Oxford, the Canada Cancer and Aging Research Laboratories, Insilico Medicine, the University of Lethbridge, the University of Copenhagen, Boston University, and the Buck Institute for Research on Aging.

“We all have a chronological age but then there is also our biological age, which is an indicator of general fitness,” says Kovalchuk. “If somebody is 35 but on a biological clock, through specific markers, it shows them at a biological age of 50, obviously they are doing something wrong. Smoking, specifically in younger people, those in their 20s, 30s, and 40s is truly harmful as it makes them biologically older.”
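The gap Kovalchuk describes can be written as simple arithmetic. The following is a minimal sketch, assuming the gap is defined as biological age minus chronological age; the function name `age_gap` is hypothetical and does not come from the study.

```python
def age_gap(biological_age, chronological_age):
    """Years by which a person appears biologically older (positive)
    or younger (negative) than their chronological age.

    Assumption: the gap is a plain difference of the two ages.
    """
    return biological_age - chronological_age


# Kovalchuk's example: chronologically 35, biologically 50.
print(age_gap(50, 35))  # prints 15
```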

Using NVIDIA TITAN Xp GPUs with the cuDNN-accelerated Keras and Theano deep learning frameworks, the team analyzed 149,000 blood biochemistry records linked to smoking status. Of the individuals in the dataset, 49,000 were smokers. The neural network analyzed 66 different biomarkers found in the bloodstream, including hemoglobin A1c, blood urea, fasting serum glucose, and serum ferritin.

Figure: Deep learning-based blood-biochemistry clocks accurately predict chronological age. (A) Prediction accuracy of the best-performing model. The model trained on 24 parameters achieved an R² of 0.57 and an MAE of 5.7 years. (B) The design of the deep learning study that used blood-biochemistry data to predict an individual’s age. Blood samples of nonsmokers were first preprocessed and normalized as previously described [8]. Next, arbitrage ranking based on 320 RF models was applied to facilitate the selection of the most appropriate feature space with maximum samples available. Afterward, missing values were reconstructed using an autoregressive model with a view towards increasing the training sets, and the resulting feature sets were used to train and test DNNs for predicting patient age and smoking status. (C) Feature importance plot. Fasting glucose, sex, and RDW exhibited higher relative importance scores than other features used in model training. Note: HDL for high-density lipoprotein cholesterol, LDL for low-density lipoprotein cholesterol, RDW for red blood cell distribution width, RBC for red blood cell counts, MCV for mean corpuscular volume, ALT for alanine transaminase, MCHC for mean corpuscular hemoglobin concentration.
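For readers unfamiliar with the two accuracy figures quoted in the caption, here is a minimal sketch of how mean absolute error (MAE) and the coefficient of determination (R²) are conventionally computed. It uses the standard textbook definitions; the ages below are made up for illustration and are not data from the study, and the function names are hypothetical.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted ages, in years."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up ages for illustration only; not data from the study.
true_ages = [30, 40, 50, 60]
predicted_ages = [35, 38, 55, 52]

mae = mean_absolute_error(true_ages, predicted_ages)
r2 = r_squared(true_ages, predicted_ages)
print(f"MAE = {mae:.1f} years, R^2 = {r2:.3f}")  # MAE = 5.0 years, R^2 = 0.764
```

An MAE of 5.7 years therefore means the predicted age was off by 5.7 years on average, while an R² of 0.57 means the model accounted for a little over half of the variance in age.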

The team used the data to train a set of supervised feed-forward deep neural networks to predict the chronological age of the smokers.
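To make "feed-forward deep neural network" concrete, the sketch below shows the forward pass of a small fully connected regression network in plain Python. It is a dependency-free stand-in, not the team's Keras model: the layer sizes, the ReLU activation, the random initialization, and all function names are assumptions, and training is omitted, so the printed value is meaningless.

```python
import random


def relu(x):
    """Rectified linear unit: pass positives through, clamp negatives to zero."""
    return x if x > 0.0 else 0.0


def dense(inputs, weights, biases, activation=None):
    """One fully connected layer: out[j] = act(sum_i inputs[i] * weights[i][j] + biases[j])."""
    outputs = []
    for j in range(len(biases)):
        total = biases[j] + sum(x * row[j] for x, row in zip(inputs, weights))
        outputs.append(activation(total) if activation else total)
    return outputs


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (a hypothetical initialization)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_out)] for _ in range(n_in)]
    return weights, [0.0] * n_out


def predict_age(features, layers):
    """Forward pass: ReLU hidden layers, then one linear output unit (age in years)."""
    activations = features
    for weights, biases in layers[:-1]:
        activations = dense(activations, weights, biases, relu)
    weights, biases = layers[-1]
    return dense(activations, weights, biases)[0]


rng = random.Random(0)
n_features = 24  # the caption's best model used 24 parameters; hidden sizes are assumptions
layers = [init_layer(n_features, 8, rng), init_layer(8, 4, rng), init_layer(4, 1, rng)]

# A made-up, already-normalized blood panel for one person.
sample = [rng.uniform(0.0, 1.0) for _ in range(n_features)]
print(predict_age(sample, layers))  # untrained weights, so the number carries no meaning
```

In supervised training, the weights would be adjusted to shrink the error between `predict_age` and each person's known age across the training records.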

“What’s beautiful about AI is that we couldn’t run these calculations before because the human mind just can’t deal with these large data sets. What it looks like is a bunch of numbers, lines of numbers, and we train it what to do and then it looks for patterns,” says Kovalchuk. “We wanted to do this using nothing fancy, just general basic bloodwork that is done on every general checkup. But with this data, using AI, you can see major patterns and it’s just fascinating.”

The classifiers used in this study have the potential to provide a better statistical assessment of the prevalence of tobacco smoking. The algorithm developed for this study can also be extended to analyze the effect of tobacco on other diseases such as diabetes.

“DNNs could be used to predict health trajectories and outcomes or to evaluate the extent to which various other environmental exposures, dietary factors, and genetic risks affect health and aging,” the researchers stated in their paper.
