
DNA could store all of the world's data in one room

Humanity has a data storage problem: More data were created in the past 2 years than in all of preceding history. And that torrent of information may soon outstrip the ability of hard drives to capture it. Now, researchers report that they’ve come up with a new way to encode digital data in DNA to create the highest-density large-scale data storage scheme ever invented. Capable of storing 215 petabytes (215 million gigabytes) in a single gram of DNA, the system could, in principle, store every bit of data ever recorded by humans in a container about the size and weight of a couple of pickup trucks. But whether the technology takes off may depend on its cost.
DNA has many advantages for storing digital data. It’s ultracompact, and it can last hundreds of thousands of years if kept in a cool, dry place. And as long as human societies are reading and writing DNA, they will be able to decode it. “DNA won’t degrade over time like cassette tapes and CDs, and it won’t become obsolete,” says Yaniv Erlich, a computer scientist at Columbia University. And unlike other high-density approaches, such as manipulating individual atoms on a surface, new technologies can write and read large amounts of DNA at a time, allowing it to be scaled up.
Scientists have been storing digital data in DNA since 2012. That was when Harvard University geneticists George Church, Sri Kosuri, and colleagues encoded a 52,000-word book in thousands of snippets of DNA, using strands of DNA’s four-letter alphabet of A, G, T, and C to encode the 0s and 1s of the digitized file. Their particular encoding scheme was relatively inefficient, however, and could store only 1.28 petabytes per gram of DNA. Other approaches have done better. But none has been able to store more than half of what researchers think DNA can actually handle, about 1.8 bits of data per nucleotide of DNA. (The number isn’t 2 bits because of rare, but inevitable, DNA writing and reading errors.)
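The 2-bits-per-nucleotide ceiling is easiest to see with a toy mapping: four letters can each carry at most two bits of information. The sketch below is purely illustrative, assuming a naive pairwise mapping; it is not the encoding used by Church's group or by the new work.

```python
# Illustrative only: a naive 2-bits-per-base mapping, not the scheme used in
# either the 2012 Church study or the new work. Four bases can each carry at
# most two bits; synthesis and sequencing errors push the practical limit
# down to roughly 1.8 bits per nucleotide.

BASE_FOR_BITS = {"00": "A", "01": "C", "10": "G", "11": "T"}
BITS_FOR_BASE = {base: bits for bits, base in BASE_FOR_BITS.items()}

def bits_to_dna(bits: str) -> str:
    """Map an even-length binary string onto DNA bases, two bits per base."""
    return "".join(BASE_FOR_BITS[bits[i:i + 2]] for i in range(0, len(bits), 2))

def dna_to_bits(seq: str) -> str:
    """Invert the mapping to recover the original bits."""
    return "".join(BITS_FOR_BASE[base] for base in seq)

assert bits_to_dna("01100010") == "CGAG"
assert dna_to_bits("CGAG") == "01100010"
```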
Erlich thought he could get closer to that limit. So he and Dina Zielinski, an associate scientist at the New York Genome Center, looked at the algorithms that were being used to encode and decode the data. They started with six files, including a full computer operating system, a computer virus, an 1895 French film called Arrival of a Train at La Ciotat, and a 1948 study by information theorist Claude Shannon. They first converted the files into binary strings of 1s and 0s, compressed them into one master file, and then split the data into short strings of binary code. They devised an algorithm called a DNA fountain, which randomly packaged the strings into so-called droplets, to which they added extra tags to help reassemble them in the proper order later. In all, the researchers generated a digital list of 72,000 DNA strands, each 200 bases long.
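For intuition only, a fountain-style encoder along the lines of that description might look like the sketch below. The function, the uniform choice of subset size, and the toy segments are all invented for illustration; the authors' actual algorithm draws the subset size from a soliton distribution and rejects droplets whose DNA translation has unfavorable GC content or long homopolymer runs.

```python
# A minimal, hypothetical sketch of fountain-style droplet generation: each
# droplet XORs a pseudo-random subset of the short binary segments and carries
# its random seed as a tag, so a decoder can regenerate the same subset and
# peel the original segments back out.

import random
from functools import reduce

def make_droplet(segments: list[bytes], seed: int) -> tuple[int, bytes]:
    """XOR a seed-determined subset of equal-length segments into one droplet."""
    rng = random.Random(seed)
    degree = rng.randint(1, len(segments))      # stand-in for the soliton degree distribution
    chosen = rng.sample(range(len(segments)), degree)
    payload = reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)),
                     (segments[i] for i in chosen))
    return seed, payload                        # (tag, data) pair to be translated into DNA

# Toy run: four 32-byte segments, three droplets
segments = [bytes([i]) * 32 for i in range(4)]
droplets = [make_droplet(segments, seed) for seed in (7, 8, 9)]
```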
They sent these as text files to Twist Bioscience, a San Francisco, California–based startup, which then synthesized the DNA strands. Two weeks later, Erlich and Zielinski received in the mail a vial with a speck of DNA encoding their files. To decode them, the pair used modern DNA sequencing technology. The sequences were fed into a computer, which translated the genetic code back into binary and used the tags to reassemble the six original files. The approach worked so well that the new files contained no errors, they report today in Science. They were also able to make a virtually unlimited number of error-free copies of their files through polymerase chain reaction, a standard DNA copying technique. What’s more, Erlich says, they were able to encode 1.6 bits of data per nucleotide, 60% better than any group had done before and 85% of the theoretical limit.
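As a rough back-of-envelope check, the two headline figures, 1.6 bits per nucleotide and 215 petabytes per gram, can be connected if one assumes a molar mass of about 330 g/mol per nucleotide of single-stranded DNA and some minimum number of physical copies of each strand for reliable sequencing. Neither assumption comes from the article, and the implied copy count below is an inference, not a reported number.

```python
# Back-of-envelope only: relate 1.6 bits/nt to ~215 PB per gram under an
# assumed molar mass of ~330 g/mol per nucleotide of single-stranded DNA.
AVOGADRO = 6.022e23
GRAMS_PER_NT = 330 / AVOGADRO            # ~5.5e-22 g per nucleotide (assumption)
BITS_PER_NT = 1.6                        # information density reported in the article

nt_per_gram = 1 / GRAMS_PER_NT           # ~1.8e21 nucleotides in one gram
single_copy_pb = nt_per_gram * BITS_PER_NT / 8 / 1e15
print(f"One copy of each strand: ~{single_copy_pb:,.0f} PB/gram")   # ~365,000 PB

copies_needed = single_copy_pb / 215     # redundancy implied by the 215 PB/gram figure
print(f"Implied physical copies per strand: ~{copies_needed:,.0f}")  # ~1,700
```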
“I love the work,” says Kosuri, who is now a biochemist at the University of California, Los Angeles. “I think this is essentially the definitive study that shows you can [store data in DNA] at scale.”
However, Kosuri and Erlich note the new approach isn’t ready for large-scale use yet. It cost $7000 to synthesize the 2 megabytes of data in the files, and another $2000 to read it. The cost is likely to come down over time, but it still has a long way to go, Erlich says. And compared with other forms of data storage, writing to and reading from DNA are relatively slow. So the new approach isn’t likely to fly where data are needed instantly; it is better suited to archival applications. Then again, who knows? Perhaps those giant Facebook and Amazon data centers will one day be replaced by a couple of pickup trucks of DNA.
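For a sense of scale, the quoted figures imply the following rough per-gigabyte costs; this is a back-of-envelope extrapolation, not numbers from the article.

```python
# Scaling the quoted figures: $7,000 to write and $2,000 to read 2 MB.
WRITE_COST_PER_MB = 7000 / 2          # $3,500 per megabyte synthesized
READ_COST_PER_MB = 2000 / 2           # $1,000 per megabyte sequenced
print(f"Writing 1 GB: ~${WRITE_COST_PER_MB * 1024:,.0f}")  # ~$3.6 million
print(f"Reading 1 GB: ~${READ_COST_PER_MB * 1024:,.0f}")   # ~$1.0 million
```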
Source: http://www.sciencemag.org/news/2017/03/dna-could-store-all-worlds-data-one-room
Posted by David Araripe
