Data mining on GRBio (2023) publications list


Alex Sanchez-Pla


July 11, 2024


This document shows how to perform an exploratory analysis of the GRBio research group's output during 2023 and 2024, based on a BibTeX file containing diverse types of references.

Get the data

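# Packages used throughout the document
library(bib2df); library(dplyr); library(ggplot2)
library(tidytext); library(wordcloud); library(RColorBrewer)
library(knitr); library(kableExtra)
# Read the .bib file and convert it to a data frame
bibfile <- "Articles/GRBioUB_refsAll_202324.bib"
bib_df <- bib2df(bibfile)
# str(bib_df)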

Basic information

Number of entries

num_articles <- nrow(bib_df)
[1] 44

Number of authors per paper

bib_df <- bib_df %>%
  mutate(num_authors = sapply(AUTHOR, function(x) length(unlist(strsplit(x, " and "))))) 

numAuthors <- bib_df$num_authors[bib_df$num_authors < 100]

hist(numAuthors, main="Distribution of number of authors per paper")

# summary(numAuthors)
t(skimr::skim (numAuthors))
skim_type     "numeric" 
skim_variable "data"    
n_missing     "0"       
complete_rate "1"       
numeric.mean  "11.55814"
numeric.sd    "10.66213"
numeric.p0    "3"       
numeric.p25   "4"       
numeric.p50   "9"       
numeric.p75   "12.5"    
numeric.p100  "59"      
numeric.hist  "▇▂▁▁▁"   
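
The author count above relies on BibTeX separating authors with the literal string " and ". The following is a minimal base R sketch of that counting step on a few hypothetical author fields (the vector `author_fields` and the helper `count_authors` are illustrative names, not part of the original analysis):

```r
# Hypothetical author fields, written as they would appear in a .bib file
author_fields <- c(
  "Garrido-Martín, Diego and Calvo, Miquel and Reverter, Ferran and Guigó, Roderic",
  "Zivanovic, Goran and Arenas, Concepció and Mestres, Francesc",
  "Besalú, Mireia and Binotto, Giulia"
)

# Split each field on the BibTeX separator " and " and count the pieces
count_authors <- function(x) {
  length(unlist(strsplit(x, " and ", fixed = TRUE)))
}
num_authors <- vapply(author_fields, count_authors, integer(1), USE.NAMES = FALSE)

print(num_authors)
```

If `AUTHOR` is already a list column holding one name per element, which is how `bib2df()` appears to return it, `lengths(bib_df$AUTHOR)` should give the same counts without any splitting.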


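# Count publications per journal, skipping entries with no JOURNAL field
journal_freq <- bib_df %>%
  filter(!is.na(JOURNAL)) %>%
  count(JOURNAL) %>%
  arrange(desc(n))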

# Truncate journal names that are too long
journal_freq <- journal_freq %>%
  mutate(JOURNAL = ifelse(nchar(JOURNAL) > 50, paste0(substr(JOURNAL, 1, 50), "..."), JOURNAL))

journal_freq %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"), full_width = FALSE)
JOURNAL                                                n
World Journal of Advanced Research and Reviews         3
Briefings in Bioinformatics                            2
Frontiers in endocrinology                             2
Nutrients                                              2
AIMS Mathematics 2023 10:23218                         1
Age and ageing                                         1
American journal of perinatology                       1
Applied Soft Computing                                 1
Ciencia Latina Revista Científica Multidisciplinar     1
EClinicalMedicine                                      1
Earth-Science Reviews                                  1
Entomological Science                                  1
European journal of neurology                          1
Food & function                                        1
Frontiers in Immunology                                1
Frontiers in Pediatrics                                1
Frontiers in nutrition                                 1
Genome Biology                                         1
Insects                                                1
International Journal of Hygiene and Environmental...  1
International journal of molecular sciences            1
Journal of gastrointestinal surgery : official jou...  1
Journal of glaucoma                                    1
Methods in molecular biology (Clifton, N.J.)           1
Molecular nutrition & food research                    1
Neurobiology of stress                                 1
Neurogastroenterology and motility                     1
Neurology                                              1
Neurology: Neuroimmunology and NeuroInflammation       1
New Insights on Principal Component Analysis           1
PLOS One                                               1
Pediatric blood & cancer                               1
Pharmaceuticals (Basel, Switzerland)                   1
RIED-Revista Iberoamericana de Educacion a Distanc...  1
Revista Chilena de Salud Pública                       1
Studies in Systems, Decision and Control               1
Surgery for Obesity and Related Diseases               1
The EMBO Journal                                       1
Weed Research                                          1
# Horizontal bar chart of the number of publications per journal
journal_plot <- ggplot(journal_freq, aes(x = reorder(JOURNAL, -n), y = n)) +
  geom_col() +
  coord_flip() +
  scale_y_continuous(breaks = scales::pretty_breaks(n = 4)) +
  labs(title = "Publication frequency by journal",
       x = "Journal",
       y = "Frequency") +
  theme(axis.text.x = element_text(size = 10),
        axis.text.y = element_text(size = 6))

# Display the plot
journal_plot
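
The counting and truncation steps used for the journal table can also be sketched in base R, which may help when checking the logic without the tidyverse loaded. The journal vector below is hypothetical, and `truncate_name` is an illustrative helper; the 50-character cut-off is the one used in this document.

```r
# Hypothetical journal names, including one missing value
journals <- c("Nutrients", "Genome Biology", "Nutrients", NA,
              "International Journal of Hygiene and Environmental Health")

# Count non-missing journals and sort by decreasing frequency
freq <- sort(table(journals[!is.na(journals)]), decreasing = TRUE)
journal_freq <- data.frame(JOURNAL = names(freq), n = as.integer(freq),
                           stringsAsFactors = FALSE)

# Truncate names longer than max_chars and mark the cut with "..."
truncate_name <- function(x, max_chars = 50) {
  ifelse(nchar(x) > max_chars, paste0(substr(x, 1, max_chars), "..."), x)
}
journal_freq$JOURNAL <- truncate_name(journal_freq$JOURNAL)

print(journal_freq)
```

Truncating after counting, as done here and in the code above, avoids merging two different journals whose names happen to share the same first 50 characters.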


Most common keywords

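# Split the KEYWORDS field on commas and count each keyword
keywords <- bib_df %>%
  filter(!is.na(KEYWORDS)) %>%
  unnest_tokens(word, KEYWORDS, token = "regex", pattern = ",") %>%
  mutate(word = trimws(word)) %>%
  count(word, sort = TRUE)

# Split titles into words, drop English stop words, and count the rest
titlewords <- bib_df %>%
  filter(!is.na(TITLE)) %>%
  unnest_tokens(word, TITLE, token = "regex", pattern = " ") %>%
  filter(!word %in% stopwords::stopwords("en")) %>%
  count(word, sort = TRUE)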

# Generate the word clouds (keywords first, then title words)
wordcloud(words = keywords$word, freq = keywords$n, min.freq = 1,
          max.words = 200, random.order = FALSE, colors = brewer.pal(8, "Dark2"))

wordcloud(words = titlewords$word, freq = titlewords$n, min.freq = 1,
          max.words = 200, random.order = FALSE, colors = brewer.pal(8, "Dark2"))
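
The keyword frequencies that feed the first word cloud come from splitting each KEYWORDS field on commas and counting the resulting tokens. A minimal base R sketch of that idea, using hypothetical keyword fields and assuming that case and surrounding whitespace should be ignored when counting:

```r
# Hypothetical KEYWORDS fields; NA marks an entry without keywords
keyword_fields <- c("Multiple sclerosis, Biomarkers", NA,
                    "biomarkers,Metabolomics , multiple sclerosis")

# Drop missing fields, split on commas, then normalise case and whitespace
tokens <- unlist(strsplit(keyword_fields[!is.na(keyword_fields)], ",", fixed = TRUE))
tokens <- tolower(trimws(tokens))
tokens <- tokens[tokens != ""]

# Frequency table sorted from most to least common
keyword_freq <- sort(table(tokens), decreasing = TRUE)
print(keyword_freq)
```

Here `names(keyword_freq)` and `as.integer(keyword_freq)` play the roles of the `words` and `freq` arguments passed to `wordcloud()` above.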

Publications 2023-24

Adel, Maja R., Ester Antón-Galindo, Edurne Gago-Garcia, Angela Arias-Dimas, Concepció Arenas, Rafael Artuch, Bru Cormand, and Noèlia Fernàndez-Castillo. 2024. “Decreased Brain Serotonin in Rbfox1 Mutant Zebrafish and Partial Reversion of Behavioural Alterations by the SSRI Fluoxetine.” Pharmaceuticals (Basel, Switzerland) 17 (February).
Anton, Alfonso, David Serrano, Karen Nolivos, Gianluca Fatti, Natasa Zmuc, Carlos Crespo, Toni Monleon-Getino, et al. 2023. “Cost-Effectiveness of Screening for Open Angle Glaucoma Compared with Opportunistic Case Finding.” Journal of Glaucoma 32 (February): 72–79.
Ayala, Nicolas, Antonio Monleon-Getino, Jaume Canela-Soler, and Tomas Chadwick-Lobos. 2023. “Random Forest Using Smartphone GPS in First Wave of COVID-19 in the Maule Region, Chile.” World Journal of Advanced Research and Reviews 2023: 531–36.
Ayala-Aldana, Nicolas, Antonio Monleon-Getino, Jaume Canela-Soler, Silvia Sancliment i Guitart, and Carmen Serrano-Munuera. 2023. “COVID-19 Temporal Dynamic as a Hospital Management Method: Case Study of the Martorell Hospital.” World Journal of Advanced Research and Reviews 17 (January): 856–61.
Ayala-Aldana, Nicolas, Antonio Monleon-Getino, Jaume Canela-Soler, Petia Radeva, and Javier Rodenas. 2023. “A Bootstrap Method Based on Linear Regression to Estimate COVID-19 Ecological Risk in Catalonia.” World Journal of Advanced Research and Reviews 17 (January): 324–32.
Ayala-Aldana, Nicolas, Antonio Monleon-Getino, Jaume Canela-Soler, and Erika Retamal-Contreras. 2023. “Predicción Con Modelo ARIMA En Series Temporales de Salmonella Spp En Chile Entre 2014-2022.” Ciencia Latina Revista Científica Multidisciplinar 7 (January): 1337–51.
Ayala-Aldana, Nicolas, Antonio Monleón-Getino, and Jaume Canela Soler. 2023. “Desafíos Para El Análisis Predictivo En El Pronóstico Clínico: Índice de Charlson Como Predictor Clásico En Epidemiología.” Revista Chilena de Salud Pública 26 (July): 238.
Besalú, Mireia, and Giulia Binotto. 2023. “Time-Dependent Non-Homogeneous Stochastic Epidemic Model of SIR Type.” AIMS Mathematics 2023 10:23218 8: 23218–46.
Campaña, María, Rafael del Hoyo, Antonio Monleón-Getino, and Javier Checa. 2023. “Predicting Legionella Contamination in Cooling Towers and Evaporative Condensers from Microbiological and Physicochemical Parameters.” International Journal of Hygiene and Environmental Health 248 (March).
Carmona-Maurici, Júlia, Araceli Rosa, Natalia Azcona-Granada, Elionora Peña, David Ricart-Jané, Anna Viñas, Maria Dolores López-Tejero, et al. 2023. “Irisin as a Novel Biomarker of Subclinical Atherosclerosis in Severe Obesity.” International Journal of Molecular Sciences 24 (May).
Corral-Pujol, Marta, Berta Arpa, Estela Rosell-Mases, Leire Egia-Mendikute, Conchi Mora, Thomas Stratmann, Alex Sanchez, et al. 2023b. “NOD Mouse Dorsal Root Ganglia Display Morphological and Gene Expression Defects Before and During Autoimmune Diabetes Development.” Frontiers in Endocrinology 14.
———, et al. 2023a. “NOD Mouse Dorsal Root Ganglia Display Morphological and Gene Expression Defects Before and During Autoimmune Diabetes Development.” Frontiers in Endocrinology 14.
David, Romain, Arina Rybina, Jean‐Marie Burel, Jean‐Karim Heriche, Pauline Audergon, Jan‐Willem Boiten, Frederik Coppens, et al. 2023. “‘Be Sustainable’: EOSC‐life Recommendations for Implementation of FAIR Principles in Life Science Data Handling.” The EMBO Journal 42 (December).
Delclòs, Xavier, Enrique Peñalver, Eduardo Barrón, David Peris, David A. Grimaldi, Michael Holz, Conrad C. Labandeira, et al. 2023. “Amber and the Cretaceous Resinous Interval.” Earth-Science Reviews 243 (August).
Fernández, Angela Gregoraci, Juan José Comuñas Gómez, Olalla Rodriguez-Losada, Vanessa Flores España, Anna Gros Turpin, Santiago Pérez Hoyos, and Félix Castillo Salinas. 2023. “Nasal High-Flow for Weaning Preterm Newborns with Risk of Chronic Lung Disease from nCPAP.” American Journal of Perinatology 40 (October): 937–44.
Fissolo, Nicolas, Laura Calvo-Barreiro, Herena Eixarch, Ursula Boschert, Luisa M. Villar, Lucienne Costa-Frossard, Mireia Ferrer, et al. 2023. “Molecular Signature Associated with Cladribine Treatment in Patients with Multiple Sclerosis.” Frontiers in Immunology 14 (July): 1233546.
Fissolo, Nicolás, Agustin Pappolla, Jordi Rio, Luisa M. Villar, Santiago Perez-Hoyos, Alex Sanchez, Lucía Gutierrez, Xavier Montalban, and Manuel Comabella. 2023. “Serum Levels of CXCL13 Are Associated with Teriflunomide Response in Patients with Multiple Sclerosis.” Neurology: Neuroimmunology and NeuroInflammation 10 (January).
Flores-Arriaga, Joel, María C. Aso, Arantzazu Izagirre, Ami D. Sperber, Olafur S. Palsson, Shrikant I. Bangdiwala, Ángel Lanas, et al. 2023. “Prevalence and Description of Disorders of Gut-Brain Interaction in Spain According to the Results of the Rome Foundation Global Epidemiology Study.” Neurogastroenterology and Motility 35 (June).
Garrido-Martín, Diego, Miquel Calvo, Ferran Reverter, and Roderic Guigó. 2023. “A Fast Non-Parametric Test of Association for Multiple Traits.” Genome Biology 24 (December): 1–32.
Geertsema, J., M. Kratochvil, R. González-Domínguez, S. Lefèvre-Arbogast, D. Y. Low, A. Du Preez, H. Lee, et al. 2024. “Coffee Polyphenols Ameliorate Early-Life Stress-Induced Cognitive Deficits in Male Mice.” Neurobiology of Stress 31 (July).
Gregori, Josep, Àlex Sánchez, and Josep Villanueva. 2023. “msmsEDA & msmsTests: Label-Free Differential Expression by Spectral Counts.” Methods in Molecular Biology (Clifton, N.J.) 2426: 197–242.
H, Mostafa, Meroño T, Miñarro A, Sanchez-Pla A, Lanuza F, Zamora-Ros R, Rostgaard-Hansen AL, et al. 2023. “Dietary Sources of Anthocyanins and Their Association with Metabolome Biomarkers and Cardiometabolic Risk Factors in an Observational Study.” Nutrients, July.
Irigoien, Itziar, Susana Ferreiro, Basilio Sierra, and Concepción Arenas. 2023. “Fuzzy Classification with Distance-Based Depth Prototypes: High-Dimensional Unsupervised and/or Supervised Problems.” Applied Soft Computing 148 (November): 110917.
Koenig, Franz, Cécile Spiertz, Daniel Millar, Sarai Rodríguez-Navarro, Núria Machín, Ann Van Dessel, Joan Genescà, et al. 2023. “Current State-of-the-Art and Gaps in Platform Trials: 10 Things You Should Know, Insights from EU-PEARL.” EClinicalMedicine 67 (January).
Madrenas, Raquel, Joan Balanyà, Concepció Arenas, Manhaz Khadem, and Francesc Mestres. 2020. “Global Warming and Chromosomal Inversion Adaptation in Isolated Islands: Drosophila Subobscura Populations from Madeira.” Entomological Science 23 (March): 74–85.
Mas-Bermejo, Patricia, Natalia Azcona-Granada, Elionora Peña, Albert Lecube, Andreea Ciudin, Rafael Simó, Alexis Luna, et al. 2024. “Genetic Risk Score Based on Obesity-Related Genes and Progression in Weight Loss After Bariatric Surgery: A 60-Month Follow-up Study.” Surgery for Obesity and Related Diseases 0.
Midaglia, Luciana, Alex Rovira, Berta Miró, Jordi Río, Nicolás Fissolo, Joaquín Castilló, Alex Sánchez, Xavier Montalban, and Manuel Comabella. 2024. “Association of Magnetic Resonance Imaging Phenotypes and Serum Biomarker Levels with Treatment Response and Long-Term Disease Outcomes in Multiple Sclerosis Patients.” European Journal of Neurology 31 (January).
Miret, Sergi Baena, Ferran Reverter Comes, and Esteban Vegas Lozano. 2024. “A Framework for Block-Wise Missing Data in Multi-Omics.” PLOS One, July.
Monleón-Getino, A., G. Pujol-Muncunill, J. Méndez Viera, L. Álvarez Carnero, W. Sanseverino, A. Paytuví-Gallart, and J. Martín de Carpí. 2023. “A Pilot Study of the Use of the Oral and Faecal Microbiota for the Diagnosis of Ulcerative Colitis and Crohn’s Disease in a Paediatric Population.” Frontiers in Pediatrics 11 (November): 1220976.
Mostafa, Hamza, Tomás Meroño, Antonio Miñarro, Alex Sánchez-Pla, Fabián Lanuza, Raul Zamora-Ros, Agnetha Linn Rostgaard-Hansen, et al. 2023. “Dietary Sources of Anthocyanins and Their Association with Metabolome Biomarkers and Cardiometabolic Risk Factors in an Observational Study.” Nutrients 15 (March): 1208.
Palma-Guillén, Alfred, Miquel Salicrú, Ariadna Nadal, Xavier Serrat, and Salvador Nogués. 2024. “Non-Chemical Weed Management for Sustainable Rice Production in the Ebro Delta.” Weed Research 64 (June): 227–36.
Peña, Elionora, Patricia Mas-Bermejo, Albert Lecube, Andreea Ciudin, Concepción Arenas, Rafael Simó, Mercedes Rigla, Assumpta Caixàs, and Araceli Rosa. 2024. “Use of Polygenic Risk Scores to Assess Weight Loss After Bariatric Surgery: A 5-Year Follow-up Study.” Journal of Gastrointestinal Surgery : Official Journal of the Society for Surgery of the Alimentary Tract, May.
Pons, Joana Villalonga, Mireia Besalú, Anna Samà Camí, and Teresa Sancho-Vinuesa. 2023. “Online Engineering Students’ Learning Strategies.” RIED-Revista Iberoamericana de Educacion a Distancia 26 (July): 237–56.
Preez, Andrea Du, Sophie Lefèvre-Arbogast, Raúl González-Domínguez, Vikki Houghton, Chiara de Lucia, Hyunah Lee, Dorrain Y. Low, et al. 2024. “Association of Dietary and Nutritional Factors with Cognitive Decline, Dementia, and Depressive Symptomatology in Older Individuals According to a Neurogenesis-Centred Biological Susceptibility to Brain Ageing.” Age and Ageing 53 (May): ii47–59.
Puyó, Pablo Velasco, Alfredo Tagarro, Susana Garcia-Obregon, Olatz Villate Bejarano, Cinta Moraleda, Jorge Huerta Aragonés, Eduardo J. Bardón Cancho, et al. 2024. “Cancer Is Not a Risk Factor for Severe COVID-19 in Children, Except in Patients with Recent Allogeneic Hematopoietic Stem Cell Transplantation or Comorbidities.” Pediatric Blood & Cancer 71.
Rio-Aige, Karla, Aina Fernández-Bargalló, Esteban Vegas-Lozano, Antonio Miñarro-Alonso, Margarida Castell, Marta Selma-Royo, Cecilia Martínez-Costa, Maria José Rodríguez-Lagunas, Maria Carmen Collado, and Francisco José Pérez-Cano. 2023. “Breast Milk Immune Composition Varies During the Transition Stage of Lactation: Characterization of Immunotypes in the MAMI Cohort.” Frontiers in Nutrition 10.
Rodriguez-Luna, David, Olalla Pancorbo, Laura Llull, Yolanda Silva, Luis Prats-Sanchez, Marián Muchada, Salvatore Rudilosso, et al. 2024. “Effects of Achieving Rapid, Intensive, and Sustained Blood Pressure Reduction in Intracerebral Hemorrhage Expansion and Functional Outcome.” Neurology 102 (May): e209244.
Rujano, Maria A., Jan Willem Boiten, Christian Ohmann, Steve Canham, Sergio Contrino, Romain David, Jonathan Ewbank, et al. 2024. “Sharing Sensitive Data in Life Sciences: An Overview of Centralized and Federated Approaches.” Briefings in Bioinformatics 25 (May).
Salicrú, Miquel, Ferran Reverter, Mireia Besalú, and Moises Burset. 2023. “Inference with Median Distances: An Alternative to Reduce the Influence of Outlier Populations.” Studies in Systems, Decision and Control 445: 439–46.
Tor-Roca, Alba, Alex Sánchez-Pla, Aniko Korosi, Mercè Pallàs, Paul J. Lucassen, Pol Castellano-Escuder, Ludwig Aigner, et al. 2023. “A Mediterranean Diet-Based Metabolomic Score and Cognitive Decline in Older Adults: A Case-Control Analysis Nested Within the Three-City Cohort Study.” Molecular Nutrition & Food Research.
Unión-Caballero, Andrea, Tomás Meroño, Raúl Zamora-Ros, Agnetha Linn Rostgaard-Hansen, Antonio Miñarro, Alex Sánchez-Pla, Núria Estanyol-Torres, et al. 2024. “Metabolome Biomarkers Linking Dietary Fibre Intake with Cardiometabolic Effects: Results from the Danish Diet, Cancer and Health-Next Generations MAX Study.” Food & Function 15 (January): 1643–54.
Vegas, Esteban, Lluís Serra, Ferran Reverter, and Josep Maria Oller. 2023. “Unveiling Chromosome Changes Compatible with Climate Warming.” New Insights on Principal Component Analysis, November.
Zawisza-Álvarez, Michał, Jesús Peñuela-Melero, Esteban Vegas, Ferran Reverter, Jordi Garcia-Fernàndez, and Carlos Herrera-Úbeda. 2024. “Exploring Functional Conservation in Silico: A New Machine Learning Approach to RNA-Editing.” Briefings in Bioinformatics 25 (May): 332.
Zivanovic, Goran, Concepció Arenas, and Francesc Mestres. 2023. “The Adaptive Value of Chromosomal Inversions and Climatic Change—Studies on the Natural Populations of Drosophila Subobscura from the Balkans.” Insects 14 (July): 596.