Questions
Research at the interface of evolution, informatics, and ecology
Phylogenomics, Evolution and Bioinformatics by Lonely Joe Parker
Biology is changing. In the space of a decade the mechanics of reading living things' DNA codes has moved from a specialised job taking weeks and hundreds of thousands of pounds, to a simple procedure many people could carry out in their own home. The implications for evolutionary biology, genetics, medicine, agriculture and conservation are profound. The challenge is to analyse this torrent of data. I build bioinformatics apps to automatically process DNA data immediately, as it is generated: I want to examine and analyse living organisms’ DNA sequences in the field - as simply, quickly, and cheaply as we measure their height, weight or any other aspect of their physical appearance.
Over time, organisms’ DNA sequences evolve in response to their changing environments, competitors and predators. By comparing similar gene sequences between species and individuals, we can use the numbers and patterns of these changes to infer their evolutionary history – for instance, when did two species diverge? Which genes were the most important for their survival? How many individuals were there in each population, and how have they spread across the globe?
These ‘phylogenetic’ studies have now shifted into a whole new gear as both computing power and sequencing ability (the speed and cost to read letters of DNA from a genome) have expanded by several orders of magnitude. We’re discovering that, although the basic principles of molecular evolution hold true, the variety and detail by which these patterns are realised in DNA sequences reflect the infinite multiplicity of physical forms seen in the natural world.
I am a Research Fellow in Phylogenomics at the National Biofilms Innovation Centre and a Fellow of the Software Sustainability Institute.
The coming ubiquity of both portable DNA sequencers and cloud computation means that scenarios formerly found only in sci-fi films (instant DNA analysis) will soon be real. I'm developing methods to streamline DNA sequence analysis using cloud computation.
Up to 80% of the microscopic organisms on the Earth exist not as solitary cells, but as 'biofilms'. These are complex, three-dimensional slimy structures where bacteria (and other microorganisms) co-exist, resisting our attempts to remove or kill them with antibiotics.
Modern DNA sequencers are highly portable, compared to lab-bound models of a decade ago. I'm trialling field-based sequencing using the MinION USB sequencer - a palm-size device with potential to revolutionise environmental metagenomics and turbotaxonomy.
Phylogenomic big-data allows us to detect statistical patterns with weak effects, such as adaptive convergent molecular evolution. I'm also interested in patterns of gene family evolution, homology, and divergent adaptive selection.
Phylogenomic models accounting for uncertainty require useful metrics on tree space - the 'distance' between two or more phylogenetic trees. However, few such useful measures exist, and I'm hunting for more...
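As a rough illustration of what a tree-space 'distance' can look like, here is a minimal sketch of the rooted (clade-based) Robinson-Foulds distance, one well-known existing measure. It is not taken from my own software: the nested-tuple tree encoding and the function names `clades` and `rf_distance` are assumptions made up for this example, and both trees are assumed to share the same set of taxa.

```python
from typing import FrozenSet, Set, Tuple, Union

# Hypothetical encoding for this sketch: a leaf is a taxon name (str),
# an internal node is a tuple of child subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def clades(tree: Tree) -> Set[FrozenSet[str]]:
    """Collect the leaf set under every internal node, excluding the root."""
    found: Set[FrozenSet[str]] = set()

    def leaves(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    all_taxa = leaves(tree)
    found.discard(all_taxa)  # the root clade is common to all trees on these taxa
    return found


def rf_distance(tree_a: Tree, tree_b: Tree) -> int:
    """Count the clades present in exactly one of the two rooted trees."""
    return len(clades(tree_a) ^ clades(tree_b))


# Two five-taxon trees that disagree only on the grouping inside (A, B, C).
tree_a = ((("A", "B"), "C"), ("D", "E"))
tree_b = ((("A", "C"), "B"), ("D", "E"))
print(rf_distance(tree_a, tree_b))  # 2: clades {A,B} and {A,C} are unshared
```

A measure like this only counts topological disagreements; it ignores branch lengths and treats every conflicting clade as equally important, which is part of why richer metrics are worth hunting for.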
The vast scale of bioinformatics datasets currently being assembled requires models of asynchronous computation: meta-algorithms where model areas are updated asynchronously on separate machines.
Developing sustainable software and open research norms is a priority for big-data empirical bioscience in the 21st century, to avoid the 'reproducibility crisis'. I'm a Fellow of the SSI.
I'm interested in the parallels and divergences between the natural world (in a systems biology context) and organisation of human societies. Maybe I'll get to take a sabbatical one day!
See Google Scholar for the most recent...
Our paper on rapid identification of samples using partial, low-coverage, MinION-sequenced reference databases for ID (at the Kew Science Festival) is in preprint. See here on bioRxiv: doi: 10.1101/281048. In it, we show (with empirical data and simulation) that the … Continue reading
Talk presented at the UK-India Joint Bioinformatics Workshop, Pirbright Institute, 09 Feb 2018 Abstract: In a globalised world of increasing trade, novel threats to animal and plant health, as well as human diseases, can cross political and geographical borders spontaneously … Continue reading
Talk given at a technology/informatics company, London, Feb 2018. An overview of contemporary advances and remaining problems in big-data biology, especially phylogenomics.
Dead excited to say our Nature Scientific Reports paper on field-based DNA extraction, sequencing (and a bit of analysis) has been picked up by the BBC World Service and The Times (UK) newspaper! You can read all about it here … Continue reading
Short lecture relating my recent work on real-time phylogenomics, implications for bioinformatics research and future directions of genomic/phylogenetic modelling to explicitly account for phylogeny, synteny and identity through coloured graphs. University of Reading, 2nd August 2017. Slides [SlideShare]: cc-by-nd
Invited seminar at the Department of Zoology, Oxford University, 30th November 2016. Summary of our field-based real-time phylogenomics (MinION DNA sequencing) experiments this year, and applicability to broad-scale tree-of-life phylogenomics and macroevolutionary biology. Slides [SlideShare]: cc-by-nd
A short presentation to the British Society for Plant Pathology’s ‘Grand Challenges in Plant Pathology’ workshop on the uses of real-time DNA/RNA sequencing technology for plant health applications. Doctoral Training Centre, University of Oxford, 14th September 2016. Slides [SlideShare]: cc-by-nc-nd … Continue reading
Talk presented at the #bench16 (benchmarking) symposium at KCL, London, Wed 20th April 2016. Funded by the SSI. Slides (Slideshare – cc-by-nd)
General science talk about the potential of real-time phylogenomics, delivered at the Jodrell Lecture Theatre, Kew Gardens, November 2nd 2015. Slides [SlideShare]: cc-by-nc-nd
Presentation on lightweight bioinformatics (Raspi / cloud computing) for real-time field-based analyses. Presented at iEOS2015, St. Andrews, 3-6th July 2015. Slides [SlideShare]: cc-by-nc-nd
In prep. SUMMARY Building on work presented previously (Parker et al., 2008), we study a number of more complex measures of phylogeny-trait association (implemented in the program Befi-BaTS / BaTS v0.10.1) which take into account the branch lengths of a … Continue reading
In prep. Manuscripts in progress (all rights reserved – you may not copy or distribute these files; content and conclusions subject to change; strictly embargoed until publication in a peer-reviewed journal/book): v1: .doc
In prep. (v3 – 14 Jun 2017) Summary. The CONTEXT (COmparative Nucleotides and Trees Exploration Tool) is a phylogenomics dataset browser that consists of a Java API and an executable binary jarfile with graphical user interface (GUI) for the high-throughput analysis … Continue reading
In prep. (v2 – 21 April 2015) Abstract Convergent evolution is a process by which neutral evolutionary processes and adaptive natural selection in response to niche specialisation lead to similar forms arising in unrelated taxa. Phenotypic convergence has been appreciated … Continue reading
In prep. (v0 – 24 February 2015) Summary. Genome Convergence Pipeline consists of a Java API and an executable binary jarfile with graphical user interface (GUI) for the high-throughput analysis of phylogenomic datasets to detect convergent molecular evolution. Motivation. Although … Continue reading
Seminar presented at the Maths Department, University of Portsmouth, 19th November 2014 Evolutionary biologists represent actual or hypothesised evolutionary relations between living organisms using phylogenies, directed bifurcating graphs (trees) that describe evolutionary processes in terms of speciation or splitting events … Continue reading
Talk presented at the 18th Evolutionary Biology Meeting At Marseille (programme), 16th-19th September 2014. (Powerpoint – note this is a draft, not the final talk, pending authorisation): EBMdraft
High-throughput comparative genomics. Research seminar presented for MSc students at University College Dublin, 24th October 2013. Invited by Prof. Emma Teeling’s lab at UCD. Powerpoint: UCD_MSc_phylogenomics_joeParker_edit
Exciting news from the lab this week… we’ve published in one of the leading journals, Nature!!! Much of my work in the Rossiter BatLab for the last couple of years has centred around the search for genomic signatures of molecular … Continue reading
Seminar presented at the Tropical Biodiversity in the 21st Century symposium, held at the Natural History Museum, London on the 3rd & 4th June 2013 (programme). Powerpoint: High-throughput computing and phylogenomics
BMC Evol Biol. 2011 May 19;11(1):131. [Epub ahead of print] Gray RR*, Parker J*, Lemey P, Salemi M, Katzourakis A, Pybus OG. *These authors contributed equally to this article. BACKGROUND: Hepatitis C virus (HCV) is a rapidly-evolving RNA virus that … Continue reading
J Virol. 2011 May 18. [Epub ahead of print] Clegg SR, Coyne KP, Parker J, Dawson S, Godsall SA, Pinchbeck G, Cripps PJ, Gaskell RM, Radford AD. Canine parvovirus 2 (CPV-2) is a severe enteric pathogen of dogs, causing high … Continue reading
PLoS Pathog. 2010 Sep 2;6(9):e1001084. Ozkaya Sahin G, Bowles EJ, Parker J, Uchtenhagen H, Sheik-Khalil E, Taylor S, Pybus OG, Mäkitalo B, Walther-Jallow L, Spångberg M, Thorstensson R, Achour A, Fenyö EM, Stewart-Jones GB, Spetz AL. Neutralizing antibodies (NAb) able … Continue reading
J Virol. 2010 Aug;84(15):7815-21. Epub 2010 May 19. Rosario M, Fulkerson J, Soneji S, Parker J, Im EJ, Borthwick N, Bridgeman A, Bourne C, Joseph J, Sadoff JC, Hanke T Although major inroads into making antiretroviral therapy available in resource-poor … Continue reading
Humphreys I, Fleming V, Fabris P, Parker J, Schulenberg B, Brown A, Demetriou C, Gaudieri S, Pfafferott K, Lucas M, Collier J, Huang KH, Pybus OG, Klenerman P, Barnes E. J Virol. 2009 Nov;83(22):11456-66. Epub 2009 Sep 9. Hepatitis C … Continue reading
A research thesis submitted for the degree of Doctor of Philosophy at the University of Oxford. J Parker Funded by: Natural Environment Research Council (UK) with support from Linacre College, Oxford. Abstract: This thesis examines the evolutionary biology of the … Continue reading
Virology. 2009 Apr 25;387(1):229-34. Epub 2009 Mar 9. Tee KK, Pybus OG, Parker J, Ng KP, Kamarulzaman A, Takebe Y. HIV is capable of frequent genetic exchange through recombination. Despite the pandemic spread of HIV-1 recombinants, their times of origin … Continue reading
Infect Genet Evol. 2008 May;8(3):239-46. Epub 2007 Aug 21. Parker J, Rambaut A, Pybus OG. Many recent studies have sought to quantify the degree to which viral phenotypic characters (such as epidemiological risk group, geographic location, cell tropism, drug resistance … Continue reading
I'm a public-funded scientist and an advocate of Open Data and Reproducible Research. My previous work as a postdoc has been funded via a variety of means and published under multiple licenses, but source data for most of my publications is available. If you want workflow scripts and software please email me and I'll try to help where I can.
For my own work I now use GitHub extensively to document and version-control my analyses; I also use Endnote a lot and complete notebooks will be published with each publication.
Obviously truly reproducible research is quite a large step on from 'give us your short reads and executables' - a complete bioinformatics analysis might include several people on multiple machines - and documenting all these steps is a key challenge for reproducibility, a cornerstone of empirical research. I'm exploring the use of Docker containers, IPython notebooks and Research Objects to make it simpler for me to document, reproduce and communicate my research.
I am moving a substantial proportion of my compute load to cloud resources, in particular Amazon's EC2. At present one machine image is available, from my 'lightweight bioinformatics' project. Search AWS AMIs for 'ami-90296be7'.
My work with field-based MinION DNA sequencing is supported by a Pilot Study Fund grant from the Kew Foundation.
The Phylo-Hackathon project is supported by a Fellowship grant from the Software Sustainability Institute.
Previous work as a postdoc and PhD student has been funded by NERC, the BBSRC, the MRC, the European Research Council, the Royal Society and the Daiwa Foundation.
I am currently employed in the Biodiversity Informatics & Spatial Analysis department of the Science Directorate at the Royal Botanic Gardens, Kew in London.
Current collaborators include:
I'm also available to provide consultancy services to private partners on big-data projects in genomics, phylogenomics, bioinformatics/informatics and statistics. This work is delivered via Kitson Consulting.
I was inspired to give my sad-looking Wordpress site the boot (and a kick up my own arse) by the very, very excellent Bedford Lab website. However, although that site uses loads of cool technologies (like CMS/source code control managed directly on GitHub, compiled to static HTML via Jekyll, all served up on a Heroku instance...) I reckoned it was overkill for me.
Instead I've nicked some ideas from that site (layouts in Bootstrap, fonts from Typekit) but haven't completely jettisoned the old (Wordpress) CMS running on LAMP yet - partly because I haven't got the time to write a good parser for all that legacy content, and partly because I still blog there about non-science things.
So the site you see is generated simply from a (largely) static HTML file with a couple of bits of PHP pulling in Wordpress posts to populate the blog and publications. Parallax effects use Aen Tan's Parallax-Scroll code and I figured out the integration with Bootstrap (actually pretty simple) with a lot of help from this tutorial. The whole site probably took less than 20 hours to put together including design, parsing WP, and deployment and testing - I think that's pretty good, on balance.
At some point I might add a couple of sub-pages for projects, etc, as well as some server-side stuff to update publication counts and Github commits on the fly. But that's tomorrow, and tomorrow's a long way away. Lastly, my existing web host is a pretty good deal so I'll probably only move the big bandwidth stuff to S3, and then only if the server logs show I really need to! Suggestions welcome.
This is part 3 in a series of posts this summer conference season. It isn’t aimed at one particular LOC – I know how hard they all work – but intended as a general reflection. I’ve been to god-knows how … Continue reading
There are good chairs and bad. Surprisingly (or not, this is academia, after all..) there’s little guidance. I’ve put together a list: Start as you mean to go on. Remind speakers to keep to time (see next point) and encourage questions … Continue reading
Summer conference season is nearly over. This is the first of three posts, informed by some reflections about the nature of scientific conferences. Students often feel under a lot of pressure when giving their first public presentations. PIs, for whom a … Continue reading
Really proud to report that the first of our bona fide real-time phylogenomics papers is now out in Scientific Reports! In the paper we managed to show a number of things that are potentially really exciting, and I’ll get to … Continue reading
Over the past few years I’ve been developing research, which I collectively refer to as ‘real-time phylogenomics’ – and this is the name of our mini-site for MinION-based rapid identification-by-sequencing. Since our paper on this will hopefully be published soon, … Continue reading
Quick note to explain some of the differences we’ve observed working with long-read data (MinION, PacBio) for sample ID via BLAST. I’ll publish a proper paper on this, but for now: Long reads aren’t just a bit longer than Illumina data, … Continue reading
Over the last 10-20 years there’s been a revolution in academic science (or should that be ‘coup’?) where many aspects of the job have been professionalised and formalised, especially project management but management in general. This generally includes tools like … Continue reading
My last MinION post described our first experiments with this really cool new technology. I mentioned then that their standard library prep was fairly involved, and we heard that the manufacturers, Oxford Nanopore, were working on a faster, simpler library prep. We … Continue reading
Quick one this, as it’s a tricky problem I keep having to Google/SO. So I’m posting here for others but mainly myself too! Here’s the situation: you have a folder (with, ooh, let’s say 140,000 separate MinION reads, for instance…) … Continue reading
I've recorded and toured as 'Lonely Joe Parker' since 2006, hence my username on Twitter - and various other social platforms - and this website's URL. For related enquiries please contact Sotones Records. You can hear music and whatnot over on the music bits of this site.
I'm a big fan of bombay mix, Red Stripe and cycling. Not so crazy about early mornings.
I was born and raised in Southampton, one of the biggest commercial ports in the world, next to the New Forest National Park. That, and some great teachers, gave me an interest in evolution, ecology, and exploring the wide goddamn world.
I studied general biology at Imperial College, University of London (2001-2004; tutors including Andy Purvis, Mike Tristem, Tim Barraclough and Alfried Vogler), gaining first-class honours. Subsequently I completed a D.Phil at Linacre College, University of Oxford, based in the Zoology department under Andrew Rambaut and Oliver G. Pybus (2004-2008). I developed novel phylogenetic strategies and bioinformatics pipelines for the analysis of viral pathogens' evolution, including hepatitis C virus (HCV) and human immunodeficiency virus (HIV).
From 2009 to 2011 I worked at the Weatherall Institute of Molecular Medicine (John Radcliffe Hospital: Univ. Oxford / Medical Research Council (UK)), developing phylogenetic and machine-learning methods for the detection of correlates between antigenicity and sequence evolution. This integrated clinical, structural, evolutionary and population genetic data, leading to immunogen design and assessment in silico, trialled in vivo for the EU-funded NGIN vaccine consortium.
Between 2011 and 2015 I worked with Stephen Rossiter at Queen Mary, University of London on large-scale phylogenomics projects looking for signals of molecular natural selection (particularly adaptive convergent evolution), with mammals as the focal taxonomic group. I developed a rich Java API for phylogenomic analyses and authored publications in journals including Nature (2013) and Current Biology (2013).
I currently hold an Early-Career Research Fellowship in Phylogenomics at the Royal Botanic Gardens, Kew. This post allows me wide freedom to pursue my research into real-time phylogenomics, field-based DNA sequencing for turbotaxonomy and metagenomics, and integrated alignment & phylogeny models of molecular evolution.
I also lead the Informatics workpackage for the Plant & Fungal Trees Of Life project, a Kew-led initiative to reconstruct a genus-level supertree for 80% of all plant and fungal genera by 2020.
I'm available to provide consultancy services to private partners on big-data projects in genomics, phylogenomics, bioinformatics/informatics and statistics through Kitson Consulting...
...and I'm always excited by new collaborations, particularly across other domains. So if you have an idea for a project or would like to explore a PhD or masters' degree at Kew, get in touch!
My full academic CV is also available.