Plant and Animal Genome V, San Diego, USA, January 12-16 1997
Arabidopsis Workshop

The session was chaired by Renate Schmidt.
Max-Delbrueck-Laboratorium in der MPG, Carl-von-Linne-Weg 10, 50829 Koeln, Fed. Rep. Germany

This year the Plant Genome conference welcomed Animal Genomes. This increased the size and diversity of the meeting so that we could see interesting QTL analysis amongst ponies and pigs alongside wheat and barley. During this meeting there was a workshop devoted to our favourite plant, Arabidopsis. What follows is a short report of the workshop by the participating speakers and Mary Anderson.

Arabidopsis genomic sequencing at TIGR- Current status and goals for the future: Steve Rounsley

Genomic analysis of Arabidopsis thaliana: Ron Davis

The Arabidopsis EST Programme: Progress, Applications and Perspectives: Michel Delseney

Plant EST Analysis: Biological Aspects: Ernie Retzel

Arabidopsis thaliana genome analysis: Dick McCombie and Rob Martienssen

Targeted insertion mutagenesis with the En-I Transposon system in Arabidopsis: Andy Pereira

Arabidopsis genomic sequencing at TIGR- Current status and goals for the future

Steve Rounsley
TIGR, 9712 Medical Center Drive, Rockville, MD 20850

Steve Rounsley presented an overview of the approach that the TIGR group is taking to the sequencing of chromosome II. Initial BAC clones distributed along the chromosome are selected for sequencing, and other BAC clones that hybridize to the same CIC YACs are then selected for end sequencing. These end sequences are used to select the minimally overlapping clones for extending the sequence walk. Steve described the different stages of the sequencing process (shotgun library production, high-throughput sequencing, and closure). The final step in the process at TIGR is annotation. Steve described their approach to annotation and stressed the need to provide supporting evidence for annotated genes. The current status of the project was also presented; further progress can be tracked at the TIGR web site.
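The clone-selection step can be illustrated with a minimal sketch. It assumes, purely for illustration, that each BAC has already been placed as a (start, end) interval on a common coordinate axis; in practice the overlap would be inferred from end-sequence matches rather than from known coordinates. The function name extend_walk and the clone names are hypothetical and are not taken from the TIGR pipeline.

```python
from typing import Dict, Optional, Tuple

Interval = Tuple[int, int]  # (start, end) in kb on a hypothetical coordinate axis


def extend_walk(seed: Interval, candidates: Dict[str, Interval]) -> Optional[str]:
    """Return the candidate clone that overlaps the seed the least while
    still overlapping it and extending past its right-hand end."""
    seed_start, seed_end = seed
    best_name = None
    best_overlap = None
    for name, (start, end) in candidates.items():
        if end <= seed_end:
            continue  # adds no new sequence to the right of the seed
        if start >= seed_end:
            continue  # no overlap with the seed, so the walk cannot be joined
        overlap = seed_end - max(start, seed_start)
        if best_overlap is None or overlap < best_overlap:
            best_name, best_overlap = name, overlap
    return best_name


# Hypothetical example: a sequenced seed clone and four end-sequenced BACs.
candidates = {
    "bacA": (60, 160),
    "bacB": (90, 190),
    "bacC": (20, 95),
    "bacD": (100, 200),
}
next_clone = extend_walk((0, 100), candidates)
```

Repeating the call with the chosen clone as the new seed extends the walk one step at a time.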

Genomic Analysis of Arabidopsis thaliana

Ron Davis
Stanford DNA Sequence and Technology Center, 855 California Avenue, Palo Alto, CA 94304, USA

Ron Davis described how the recently formed SPP Consortium, composed of the Stanford DNA Sequencing and Technology Center, the Plant Gene Expression Center, and the University of Pennsylvania, had started sequencing on chromosome 1 of Arabidopsis thaliana. Their strategy involves mapping BAC clones from the TAMU and IGF libraries to chromosome 1, followed by the construction of sequencing libraries in M13 and plasmid vectors. In addition, Ron described how they are developing a set of markers for Arabidopsis thaliana that can be analyzed in an automated, non-gel based manner. They have used denaturing high performance liquid chromatography (DHPLC), a novel mismatch detection technology, to identify over 200 single-nucleotide changes between the Columbia and Landsberg erecta strains of A. thaliana. Sequencing of these polymorphisms from both strains reveals a variety of substitutions and small deletions. These markers will be the basis for the construction of an Affymetrix chip which can be used to integrate the physical and genetic map. They are also initiating the arraying of Arabidopsis cDNAs on glass slides using a robotic arrayer developed at Stanford. These arrays will be hybridized with two-colour fluorescent probes for expression analysis and for identification of insertion mutant alleles in T-DNA tagged Arabidopsis strains.

The Arabidopsis EST Programme: Progress, Applications and Perspectives

Michel Delseney
LPBMV, Universite de Perpignan, 52, avenue de Villeneuve, 66860 Perpignan-Cedex, France

Michel Delseney spoke about the ESSA (European Scientists Sequencing Arabidopsis) programme, which has contributed about half the non-redundant Arabidopsis thaliana ESTs in dbEST. However, with approximately 30,000 ESTs now available, increasing redundancy means that this strategy is no longer a cost-effective way to tag new genes. Analysis of the ESTs has produced some interesting findings. Using the TIGR Assembler, it has been possible to reconstruct about 12,000 unique contigs. This is likely to be an overestimate of the number of genes effectively tagged, because sequencing errors and non-overlapping ESTs from the same gene can appear to tag different genes. Further, a detailed investigation of ESTs tagging ribosomal protein genes identified 104 full-length cDNAs corresponding to 50 types of unambiguously identified ribosomal proteins. These are encoded by small multigene families of up to four members. Michel described how they have mapped about 600 ESTs corresponding to genes of interest to their labs (genes expressed in etiolated seedlings, genes for LEA proteins or protein kinases), combining classical RFLP or CAPS genetic mapping with physical mapping by PCR on the 3-D pools of the CIC YAC library. It is now clear that many genes in Arabidopsis are organised as mosaic structures with short exons and multiple introns. Michel also described some exciting work in which they have taken the 1-2% of ESTs showing conservation at the nucleotide level between Arabidopsis and rice and used them for comparative mapping, as common markers that are likely to exist in all plant species.
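The logic of PCR screening on 3-D pools can be sketched as follows. The sketch assumes a generic three-dimensional pooling design in which every clone contributes DNA to exactly one plate pool, one row pool and one column pool; the actual layout of the CIC YAC pools is not described here, so the labels and the function name candidate_addresses are hypothetical.

```python
from itertools import product
from typing import Iterable, List, Tuple


def candidate_addresses(
    positive_plates: Iterable[int],
    positive_rows: Iterable[str],
    positive_columns: Iterable[int],
) -> List[Tuple[int, str, int]]:
    """Return every (plate, row, column) address consistent with the
    PCR-positive pools. One positive pool per dimension identifies a single
    clone; several positives give a list of candidates that would need a
    follow-up PCR on the individual clones."""
    return sorted(
        product(set(positive_plates), set(positive_rows), set(positive_columns))
    )


# Hypothetical results for one EST-derived primer pair.
unique_hit = candidate_addresses([7], ["C"], [11])
ambiguous_hits = candidate_addresses([7, 9], ["C"], [2, 11])
```

The appeal of the design is that the number of PCR reactions grows with the sum of the pool counts in the three dimensions rather than with the number of clones in the library.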

Plant EST Analysis: Biological Aspects

Ernie Retzel
Medical School, University of Minnesota, Minneapolis, MN 55455, USA

Ernie Retzel discussed the biological insights that result from merging the data from the variety of plant EST and genomic sequencing projects. Integrating the analytical data from the US and French Arabidopsis projects with the Japanese rice project and the USDA-IFG/North Carolina State University pine project has produced some tantalizing preliminary results on what may be specific to Arabidopsis and, at the very least, allows commonalities across the evolutionary spectrum to be identified. This information was developed from Blastn self-comparisons. The results were extracted from the relational database management system developed as part of the US/NSF-sponsored project and are displayed as sets of data common to all tested plants, common to any pair of organisms, or apparently unique to a genus.
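The three-way display described above amounts to grouping sequences by the set of organisms in which they are found. The following is a minimal sketch of that grouping, assuming the Blastn results have already been reduced to shared identifiers per organism; the identifiers, the data and the function name partition_by_sharing are hypothetical and do not reflect the project's database schema.

```python
from typing import Dict, FrozenSet, Set


def partition_by_sharing(hits: Dict[str, Set[str]]) -> Dict[FrozenSet[str], Set[str]]:
    """Group identifiers by the exact set of organisms in which they occur."""
    groups: Dict[FrozenSet[str], Set[str]] = {}
    all_ids = set().union(*hits.values())
    for identifier in all_ids:
        present_in = frozenset(
            organism for organism, ids in hits.items() if identifier in ids
        )
        groups.setdefault(present_in, set()).add(identifier)
    return groups


# Hypothetical identifiers standing in for groups of matching sequences.
hits = {
    "arabidopsis": {"f1", "f2", "f3"},
    "rice": {"f1", "f3", "f4"},
    "pine": {"f1", "f5"},
}
groups = partition_by_sharing(hits)
common_to_all = groups[frozenset(hits)]
arabidopsis_only = groups[frozenset({"arabidopsis"})]
```

Keys containing all organisms correspond to the common set, keys with two organisms to the pairwise sets, and single-organism keys to the apparently unique sequences.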

In addition, new techniques for "clustering" of related sequences were described. These techniques also involve large-scale self-comparisons of data sets, but extend the number of variables examined: the variables are optimized simultaneously and the results are displayed graphically. The process was tested using data that included the contigs generated by Steve Rounsley at TIGR, and the results reflected the very high stringency used to generate these contigs. In addition, the visualization allows further insights into relationships not directly based on functional similarity. As it develops, this technique should open the way to the exploration of higher-order relationships between sequences.

Arabidopsis thaliana genome analysis

Dick McCombie and Rob Martienssen
Cold Spring Harbor Lab, P. O. Box 100, Cold Spring Harbor, NY, 11724, USA

Dick McCombie and Rob Martienssen made a joint presentation, with Dick talking about the sequencing efforts and Rob about the targeted approach they are taking to gene discovery and the study of gene expression. The Cold Spring Harbor group, in collaboration with colleagues at Applied Biosystems and the Genome Sequencing Center at Washington University, began by sequencing two areas on the short arm of chromosome IV. They have continued sequencing on this chromosome and have also extended their efforts to chromosome V. While the initial sequencing was cosmid-based, they are now sequencing primarily from bacterial artificial chromosome (BAC) clones. The sequencing strategy uses M13 shotgun sequencing and ABI automated sequencers. The intention is to complete 6.5-7 million bases by September of 1999. They have already sequenced in excess of 250 kilobases from these regions. Sequence is made available to the community via the World Wide Web as it is generated, with no delay prior to public availability.

Rob then described how they have developed an efficient gene and enhancer trap transposon mutagenesis system in Arabidopsis using the maize elements Ac/Ds and the reporter gene GUS. More than 10,000 insertions have been generated at random around the genome, and PCR amplification and sequencing of a few hundred insertion sites has revealed insertion into a wide variety of genes. Gene traps that lead to reporter gene expression are inserted in the correct orientation within the transcription unit, while enhancer traps are often found outside the transcribed region. Gene disruptions in several known genes, such as HOOKLESS1 and AGL8, give rise to mutant phenotypes and result in the expected patterns of reporter gene expression. By sequencing the insertion sites of many thousands of elements, they hope to develop a dense map of transposons whose locations are precisely known by comparison with the genomic sequence. Because these insertions are single copy, they can be used as genetic landmarks to tie the sequence to the genetic map. They can also be used for local saturation mutagenesis, and Rob's group has demonstrated this possibility around the prolifera locus on chromosome 4.

Targeted insertion mutagenesis with the En-I transposon system in Arabidopsis

Andy Pereira
Centre for Plant Breeding and Reproduction Research (CPRO-DLO), PO Box 16, 6700AA Wageningen, The Netherlands

Andy Pereira described how the maize Enhancer-Inhibitor (En-I or Spm-dSpm) transposable element system, when introduced into Arabidopsis, displays continuous and frequent transposition. The transposon system consists of an En-transposase source under control of the CaMV35S promoter, which mediates 'constitutive' transposition of the mobile I elements throughout plant development and often leads to transposon multiplication. About 20 tagged mutants and 5 cloned genes have been isolated by a random tagging strategy. Andy also described how targeted tagging with linked I elements has been successful in the recovery of tagged Cer6 and Clv1 mutants. The numerous tagged mutants, together with a resource of about 30 mapped I elements, now facilitate targeted tagging with linked transposons of specific genes that have a distinguishable mutant phenotype. To produce a 'mutation machine' in Arabidopsis and recover inserts in any gene of interest, this linked transposition behaviour was used to produce plant populations saturated for independent transpositions in specific chromosomal regions. Plant populations with an I element transposing on chromosome 4 were organized in pools, and the extracted DNA was used for PCR screening with primers derived from sequenced genes on chromosome 4. Many inserts were found in genes identified in the region of chromosome 4 sequenced by ESSA. Target genes at different distances from the mapped I element are being screened to estimate mutagenesis frequencies. For a 'targeted gene inactivation' strategy covering the whole genome, populations containing multiple (20-40 per plant) independent I element insertions are being produced, which can be used for PCR screening of inserts in any gene.
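For the whole-genome strategy, a rough calculation shows how the number of insertions per plant affects the population size required. The sketch below assumes that insertions land uniformly at random across the genome, which is a simplification and does not apply to the linked transposition used for regional saturation. The genome size (100 Mb) and target gene size (2 kb) are illustrative assumptions, not figures from the talk, and the function name plants_needed is hypothetical.

```python
import math


def plants_needed(
    genome_bp: float, target_bp: float, inserts_per_plant: int, p_hit: float
) -> int:
    """Number of plants needed so that at least one insertion lands in the
    target gene with probability p_hit, assuming uniform random insertion."""
    p_single = target_bp / genome_bp  # chance that one insertion hits the target
    # Solve 1 - (1 - p_single) ** n >= p_hit for the total number of inserts n.
    n_inserts = math.log(1.0 - p_hit) / math.log(1.0 - p_single)
    return math.ceil(n_inserts / inserts_per_plant)


# Illustrative values only: 100 Mb genome, 2 kb target, 95% chance of a hit.
with_20_inserts = plants_needed(1e8, 2e3, 20, 0.95)
with_40_inserts = plants_needed(1e8, 2e3, 40, 0.95)
```

Under these assumptions, doubling the number of independent insertions per plant roughly halves the population that has to be pooled and screened.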