Microsoft has apparently firmed up its plans for a DNA-based storage device that it expects to be commercially available within about three years.
The software giant originally unveiled its research intoDNA作为档案储存介质去年;它描述了该技术能够将数据量存储在“一个大数据中心压缩成几种糖多维数据集中的数据中心。或者互联网上的所有公开可访问的数据都滑入了鞋盒。雷竞技电脑网站
“这是DNA存储的承诺 - 一旦科学家能够扩展技术并克服一系列技术障碍,”该公司在2016年博客帖子中表示。
塔拉棕色摄影/华盛顿大学
Once endoded, the synthetic DNA is able to store 200MB of data in a space smaller thatn the tip of a pencil. Extrapolated, DNA storage could potentially store a data center's worth of information in a space the size of a shoebox.
Microsoft发言人拒绝评论其DNA存储研究的进展。
但在一篇文章中麻省理工学院技术评论Microsoft Research的合作伙伴建筑师Doug Carmean表示,该公司希望在三年内为我们的数据中心提供一定数量的DNA数据,以便至少进行精品申请。“雷竞技电脑网站
存储德vice was described by Carmean as about the size of a large, 1970s-era Xerox copier with a data write speed of only 400MBps -- something Carmean admitted needs to increase to 100MBps to compete with other archive storage mediums such as magnetic tape drives.
Natalya Yezhkova, a research director at IDC, said with the staggering rate at which digital data is growing, the necessity of a DNA-type storage medium will be critical in the next 10 to 15 years.
“目前,解决这一增长的唯一方法是增加数据优化技术的足迹,无论是压缩还是重复数据删除,”叶茨波卡说。“这些技术很大,并减轻一些数据增长,但在长期内,我们肯定需要别的东西。”
For example, some healthcare data must be stored for the life of a patient, and federal regulations for auditing and civil litigation purposes require some financial records to be stored for seven or more years.
并且,随着大数据分析的发展,更多的公司正在寻找从他们的销售和客户数据档案中剔除有用的营销信息的方法。
Then there's video, photograph and audio files, something every smart phone owner can create at their leisure and that's increasingly stored by cloud services.
Researchers with Microsoft and UW developed what they described as "a novel approach" to convert the long strings of ones and zeroes in digital data into the four basic building blocks of DNA sequences -- adenine, guanine, cytosine and thymine -- represented as As, Gs, Cs and Ts.
数字数据被分解成片,并通过将其作为大量的微小DNA分子合成,这可以脱水并保存用于长期储存。
To access the stored data, the researchers encode the equivalent of zip codes and street addresses into the DNA sequences. Polymerase Chain Reaction (PCR) techniques -- commonly used in molecular biology -- help them more easily identify the zip codes they are looking for.
Tara Brown Photography/University of Washington
UW副教授Luis Henrique Ceze,蓝色,研究科学家Lee Oniterick准备含有数字数据的DNA进行测序,这允许它们读取和检索原始文件。
脱氧核糖核酸has a theoretical limit of being able to store more than one exabyte per millimeter, which is eight orders of magnitude denser than magnetic tape. DNA-based storage also has the benefit of eternal relevance: As long as there is DNA-based life, there will be strong reasons to read and manipulate DNA, the researchers said in研究论文。
云服务和超越计算提供商不断寻求存储越来越繁琐的数据的新方法;据Yezhkova称,这就是DNA存储可能会看到其初始家庭的位置。云归档服务如亚马逊冰川或谷歌的云平台将可能会为存储介质的候选人,而不是当今最突出的技术的容量和寿命。
"It's a trade-off of speed versus the economics of storing massive amounts of data for 50 years or more that could be untouched," Yezhkova said.
“这也是亚马逊或谷歌也可以研究DNA存储,”她继续。“他们不一定是在谈论这个或公开。”
As promising as DNA storage appears to be, there are still issues that need to be solved before it can be a viable technology for the data center -- for example, compatibility with existing applications and hardware. But, if those issues could be solved, "it would have a tremendous impact," Yezhkova said.
Since 2005, the amount of electronic data has been doubling every two years, accordingthe Digital Universe是IDC正在进行的研究。
The study estimates that from 2005 to 2020, the amount of electronic data generated throughout the world will grow by a factor of 300, from 130 exabytes to 40,000 exabytes, or 40 trillion gigabytes, which is more than 5,200GB for every person on earth.
对于分析价值,仅探索了数字宇宙的一小部分。IDC估计到2020年,多达33%的数字宇宙将包含如果分析的情况可能是有价值的信息。
到2020年,云计算提供商将“触摸”数字宇宙中的近40%的信息 - 这意味着一个字节将在从发起人出发的旅程中的云中存储或处理。
去年,微软与华盛顿大学(UW)的研究人员表示,他们通过将200MB的合成DNA股线数据存储了世界纪录。
研究人员表示,令人印象深刻的部分达到200MB里程碑,这不仅仅是他们可以编码合成DNA的数据,然后解码,它也是他们能够将其存放的空间。
Once encoded, the data occupied a spot in a test tube "much smaller than the tip of a pencil," Carmean said at the time.
The DNA storage also has a half-life of 500 years, even in harsh conditions. The half-life of DNA -- just as with radioactive material -- determines its rate of decay.
Today's most popular storage mediums, magnetic tape, hard disk drives, optical discs and NAND flash storage all have limited lifespans, which max out anywhere from five years to several decades.
Meanwhile, the proportion of data in the digital universe that requires protection is growing faster than the digital universe itself, from less than a third in 2010 to more than 40% in 2020, according to IDC.
“通过像物联网和大数据分析等项目,数据量将继续增加并且需要存储。如何在该行业中讨论如何存储所有这些数据的问题,”叶正帕说。