Coolthing Of Theday

Monday, 18 November 2013

GQL, no, not the Gnome Query Language, the Genome Query Language

Posted on 17:23 by Unknown

Microsoft Research Connections Blog - Interactive genomics: querying genomes across the cloud

Big data: you can hardly pick up a newspaper without reading about some new scientific or business acumen derived from mining some heretofore-untouched volumes of digital information. Well, I’m happy to say that genome sequence data—which certainly qualifies as big, both in volume and velocity—is joining the party, and in a most meaningful way. When combined with information from medical records, genome data can be mined for new insights into treating disease.

...

Towards this vision, I have been working with researchers at University of California San Diego (UCSD) and have invented the Genome Query Language (GQL), which features three operators that allow error-resilient manipulation of genome intervals. This, in turn, abstracts a variety of existing genomic software tasks, such as variant calling (determining whether a person has a different gene from the reference) and haplotyping (ascribing genomic variation as being inherited from the mother or the father). GQL is inspired by the classic database query language SQL and has similar operators; however, GQL introduces a major new operator: the fault-tolerant union of genomic intervals.
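The excerpt does not spell out how the fault-tolerant union behaves, so the following is only a rough sketch of one way to think about it: an ordinary union of genomic intervals that also bridges small gaps, on the assumption that a gap of a few bases is more likely a sequencing or alignment error than a real break. It is plain Python, not GQL, and the `tolerant_union` function and its `max_gap` parameter are hypothetical names invented for this illustration.

```python
from typing import List, Tuple

# Half-open interval [start, end) on a single chromosome.
Interval = Tuple[int, int]


def tolerant_union(intervals: List[Interval], max_gap: int = 0) -> List[Interval]:
    """Merge intervals that overlap or lie within max_gap bases of each other.

    Hypothetical helper: max_gap=0 gives a plain interval union, while a
    larger value also bridges small gaps between neighbouring intervals.
    """
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start - merged[-1][1] <= max_gap:
            # Close enough to the previous interval: extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


# Made-up read coordinates, for illustration only.
reads = [(100, 150), (140, 200), (205, 260), (400, 450)]
strict = tolerant_union(reads)                # plain union
tolerant = tolerant_union(reads, max_gap=10)  # also bridges the 5-base gap
print(strict)
print(tolerant)
```

With `max_gap=0` the two overlapping reads collapse into one interval and the rest stay separate; with `max_gap=10` the interval starting at 205 is folded in as well, because it begins only 5 bases after the previous one ends.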

...

To understand how GQL could be used on the Windows Azure platform in the cloud, imagine that a biologist is working on the ApoE gene, which is responsible for forming lipoproteins in the body. Wondering how ApoE gene variations affect cardiovascular disease (CV), the biologist types in a query with the parameters “ApoE, CV” on a tablet computer, just as you might enter a search-engine query. The query is sent to the GQL implementation in the cloud, which returns the ApoE region of the genome in patients with cardiovascular disease. Since the ApoE gene is quite small, the data is processed quickly in the cloud and returned in seconds to the biologist’s tablet. The biologist can then use customized bioinformatics software to mine the data to identify variations.

We have implemented GQL on Windows Azure and used it to query genomic data expeditiously. We have shown, for example, how GQL can be used to query The Cancer Genome Atlas for large structural variations by using only 5 to 10 lines of high-level code. The code took approximately 60 seconds to execute on the Windows Azure application in the cloud when run on an input human genome file of 83 gigabytes. GQL can improve existing software as well by refactoring queries, significantly speeding up results. It could also be used to facilitate browsing by queries and not just location within the UCSC genome browser.

To make the GQL implementation provide interactive speeds, two optimizations were crucial: cached parsing and lazy joins. Combined, they sped up query processing by a factor of 100. I encourage interested readers to explore the details of our research—the GQL queries we used, the optimizations we implemented, and the experimental results we achieved—in the Microsoft Research Technical Report: Interactive Genomics: Rapidly Querying Genomes in the Cloud.
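The excerpt names the two optimizations but does not describe how GQL implements them; the technical report is the place for that. As a loose illustration of the general ideas only, here is a small Python sketch in which parsing results are memoized with `functools.lru_cache` and the join is a generator that produces matches on demand. The `chr:start-end` text format, the function names, and the naive nested-loop join are all assumptions made for this example, not GQL's actual design.

```python
from functools import lru_cache
from typing import Iterable, Iterator, List, Tuple

# (chromosome, start, end), half-open.
Interval = Tuple[str, int, int]


@lru_cache(maxsize=None)
def parse_region(text: str) -> Interval:
    """Parse 'chr:start-end'; repeated calls with the same text hit the cache."""
    chrom, span = text.split(":")
    start, end = span.split("-")
    return chrom, int(start), int(end)


def lazy_overlap_join(
    left: Iterable[str], right: List[str]
) -> Iterator[Tuple[Interval, Interval]]:
    """Yield overlapping (left, right) pairs one at a time.

    Being a generator, it does no work until the caller asks for a result,
    and it never materializes the full join in memory.
    """
    for l_text in left:
        l = parse_region(l_text)
        for r_text in right:
            r = parse_region(r_text)  # cached after the first pass over `right`
            if l[0] == r[0] and l[1] < r[2] and r[1] < l[2]:
                yield l, r


# Made-up coordinates, for illustration only.
reads = ["chr19:100-200", "chr19:500-600", "chr1:100-200"]
genes = ["chr19:150-300", "chr1:900-950"]

pairs = lazy_overlap_join(reads, genes)  # nothing has been parsed or joined yet
first = next(pairs)                      # does just enough work for one match
rest = list(pairs)                       # drains the remaining matches, if any
print(first, rest)
print(parse_region.cache_info())
```

A caller that only needs the first few matches, say to fill one screen of a browser, pays only for those, and every region string on the right-hand side is parsed once no matter how many left-hand rows it is compared against.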

UCSC Genome Bioinformatics Site

Welcome to the UCSC Genome Browser website. This site contains the reference sequence and working draft assemblies for a large collection of genomes. It also provides portals to the ENCODE and Neandertal projects.

We encourage you to explore these sequences with our tools. The Genome Browser zooms and scrolls over chromosomes, showing the work of annotators worldwide. The Gene Sorter shows expression, homology and other information on groups of genes that can be related in many ways. Blat quickly maps your sequence to the genome. The Table Browser provides convenient access to the underlying database. VisiGene lets you browse through a large collection of in situ mouse and frog images to examine expression patterns. Genome Graphs allows you to upload and display genome-wide data sets.

...

UCSC Genome Browser on Human Feb. 2009 (GRCh37/hg19) Assembly

I so have no idea what to do with this, but I still think it's cool as heck. There's got to be a way I can work this into a zombie novel or a CSI kind of show... :P

Posted in Azure, Science | No comments