I am a programmer and architect (the kind that writes code) with a focus on testing and open source; I maintain the PHPUnit_Selenium project. I believe programming is one of the hardest and most beautiful jobs in the world. Giorgio is a DZone MVB and is not an employee of DZone and has posted 636 posts at DZone.

Yahoo! Query Language

05.26.2010

The Yahoo! Query Language, or YQL, is an API that provides a uniform approach for making requests to web services of any kind, such as Google, Flickr, or Twitter.

YQL derives its name from SQL, and it has a very similar syntax. The goal of the YQL platform is to spare developers from learning the API of every different web service they're going to use: instead, they only need to learn the schema of a table, and can query it as if it were part of a relational database.

This level of abstraction can flatten the learning curve of many complex web services: it is the old idea of moving the variability of a program from the logic (accessing a web service with its own API) to the data (the YQL queries and metadata). From the programmer's point of view, it's the same principle as substituting a factory with a configurable container, or multiple tests with a single one that acts on different data sets.
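As a rough illustration of moving variability from logic to data, here is a minimal Python sketch. The queries are the ones that appear later in this article; the names QUERIES, run_all and fake_execute are hypothetical, and fake_execute is only a stand-in for a real HTTP call.

```python
# Each entry is data: supporting another service means adding a row,
# not writing another client for that service's own API.
QUERIES = {
    "flickr": 'select * from flickr.photos.search where text="Cat" limit 10',
    "geoip": "select * from pidgets.geoip where ip='128.100.100.128'",
    "places": 'select * from geo.places where text="san francisco, ca"',
}


def run_all(execute, queries=QUERIES):
    """Run every named query through the same `execute` callable."""
    return {name: execute(query) for name, query in queries.items()}


def fake_execute(query):
    # Stand-in for a real request (assumption): it only reports
    # which virtual table the query targets.
    return query.split(" from ", 1)[1].split()[0]


if __name__ == "__main__":
    for name, table in run_all(fake_execute).items():
        print(name, "->", table)
```

The logic in run_all never changes; only the table of queries does.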

YQL is by no means meant as a transparent solution: although requests are cached, treating a remote web service, or a mashup of remote web services, as if it were a real relational database is a good recipe for disaster (the YQL syntax is also not fully compliant with SQL). Instead, YQL is useful because it reuses concepts and models that developers have already grasped (the relational ones), and because it tries to leverage part of a standard like ANSI SQL instead of inventing yet another syntax and API.

The virtual tables available in the standard YQL platform are many: more than one hundred officially supported tables, which range from Flickr to Yahoo! Answers and Search. If we count community tables, which have been set up by non-Yahoo! developers, the total grows to more than 700 tables, including Google, Twitter and YouTube services.

In fact, anyone can set up an Open Data Table by providing the metadata needed to transform an API into one or more virtual relational tables. In the words of the official documentation:

YQL contains an extensive list of built-in tables for you to use that cover a wide range of Yahoo! Web services and access to off-network data. Open Data Tables in YQL allow you to create and use your own table definitions, enabling YQL to bind to any data source through the SQL-like syntax and fetch data. Once created anyone can use these definitions in YQL. 

The fastest way to get started with YQL is to use the in-browser console. Within the console, you can try different example queries, or write your own and get an immediate result in XML or JSON. You can also list the tables and introspect them to see which kinds of WHERE conditions or JOINs you can add.

If you don't want to use the console for some reason, you can use curl or even netcat and it won't get much more difficult. This is the beauty of a RESTful interface.

Examples

I've put together some query examples taken from the YQL documentation. They are linked to the console so you can follow them and tune the result to discover YQL's capabilities.

The first example in Dustin Whittle's YQL talk here in Italy was searching Flickr for photos of Lolcats:

select * from flickr.photos.search where text="Cat" limit 10

Or we can perform some GeoIP location, discovering where an IP address comes from:

select * from pidgets.geoip where ip='128.100.100.128'

Obtain geographical data such as latitude and longitude of a city:

select * from geo.places where text="san francisco, ca"

Or we can do a nested query, passing to Yahoo! Search (or Google if you prefer) the result of a different query, in this case the title of the first item in an RSS feed:

select * from search.web where query in (select title from rss where url="http://rss.news.yahoo.com/rss/topstories" | truncate(count=1)) limit 1

The RSS is from Yahoo! News, but you can parse any feed within YQL. Or parse an HTML page with XPath expressions (or with quicker regular expressions, but remember that HTML is not a regular language):

select * from html where url="http://finance.yahoo.com/q?s=yhoo" and
      xpath='//div[@id="yfi_headlines"]/div[2]/ul/li/a'
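To get a feel for what that XPath expression selects, here is a rough local analogue in Python, run against a small invented XHTML fragment rather than the real Yahoo! Finance page. The sample markup and the function name headline_links are assumptions for illustration; Python's xml.etree.ElementTree supports only a subset of XPath, so the leading // becomes .// in this sketch.

```python
import xml.etree.ElementTree as ET

# Invented, well-formed stand-in for the page (not the real markup).
SAMPLE_PAGE = """<html><body>
<div id="yfi_headlines">
  <div>Headlines</div>
  <div>
    <ul>
      <li><a href="http://example.com/1">First headline</a></li>
      <li><a href="http://example.com/2">Second headline</a></li>
    </ul>
  </div>
</div>
</body></html>"""


def headline_links(page):
    """Return (text, href) pairs for the links matched by the path."""
    root = ET.fromstring(page)
    # Same steps as the YQL example: the div with the given id,
    # its second child div, then every link inside the list items.
    path = ".//div[@id='yfi_headlines']/div[2]/ul/li/a"
    return [(a.text, a.get("href")) for a in root.findall(path)]


if __name__ == "__main__":
    for text, href in headline_links(SAMPLE_PAGE):
        print(text, href)
```

Real-world HTML is rarely well-formed XML, which is one reason to let a service such as YQL do the parsing.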

You're not limited to SELECT queries: you can also execute INSERT, UPDATE or DELETE statements, although they usually require authentication. For example, you can create a new tweet with YQL:

use 'http://www.yqlblog.net/samples/twitter.status.xml'; insert into twitter.status (status,username,password) values ("Playing with INSERT, UPDATE and DELETE in YQL", "twitterusername","twitterpassword")

In any case, the YQL console will be kind to you and will also prepare a ready-to-use REST query, like:

http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20html%20where%20url%3D%22http%3A%2F%2Ffinance.yahoo.com%2Fq%3Fs%3Dyhoo%22%20and%0A%20%20%20%20%20%20xpath%3D'%2F%2Fdiv%5B%40id%3D%22yfi_headlines%22%5D%2Fdiv%5B2%5D%2Ful%2Fli%2Fa'&diagnostics=true&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys

with customizations such as the format already included. That one is a bit unreadable because of the complex query, but the original Lolcats example would be:

https://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20flickr.photos.search%20where%20text%3D%22Cat%22%20limit%2010&diagnostics=true

where the only unreadable parts are the percent-encoded characters, such as the %20 that encode the spaces (of course, an external application would do this encoding automatically).
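As a minimal sketch of doing that encoding from code, the following Python builds the same kind of URL with the standard library. The endpoint and the q and diagnostics parameters are taken from the URLs above; the function name build_yql_url and the optional format parameter are assumptions made for illustration.

```python
from urllib.parse import quote, urlencode

# Endpoint copied from the example URLs in this article.
YQL_ENDPOINT = "https://query.yahooapis.com/v1/public/yql"


def build_yql_url(query, diagnostics=True, fmt=None):
    """Percent-encode a YQL statement into a REST URL."""
    params = {"q": query}
    if diagnostics:
        params["diagnostics"] = "true"
    if fmt is not None:
        # Assumed parameter name for choosing the response format.
        params["format"] = fmt
    # quote() encodes spaces as %20 (quote_plus would use "+");
    # "*" is left readable, as in the console-generated URL.
    return YQL_ENDPOINT + "?" + urlencode(params, quote_via=quote, safe="*")


if __name__ == "__main__":
    print(build_yql_url(
        'select * from flickr.photos.search where text="Cat" limit 10'
    ))
```

Under these assumptions, the Lolcats query produces the same URL as the one shown above.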

Nearly every language and platform is capable of performing REST queries, as they are simple HTTP requests with no encapsulation of data in complex envelopes as in SOAP's case. Parsing the response is also very simple, since your language very likely has at least one library that understands either JSON or XML. Even if the original web service uses SOAP, you can now access it with YQL via a RESTful, almost-SQL-based interface: web services have never been so simple, and now you can almost make your washing machine tweet when it finishes its job.
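To illustrate how little code the parsing can take, here is a hedged Python sketch that reads a JSON response with the standard library. The sample body is invented, and its layout (a top-level query object containing count and results) is an assumption about the response shape, as is the helper name extract_items.

```python
import json

# Invented sample body; the real layout may differ (assumption).
SAMPLE_RESPONSE = """
{"query": {"count": 2, "results": {"photo": [
    {"id": "1", "title": "Cat on a sofa"},
    {"id": "2", "title": "Sleepy cat"}
]}}}
"""


def extract_items(body, key):
    """Return the list of result items stored under `key`."""
    payload = json.loads(body)
    results = payload["query"].get("results") or {}
    items = results.get(key, [])
    # Assumption: a single match may arrive as one object, not a list.
    if isinstance(items, dict):
        items = [items]
    return items


if __name__ == "__main__":
    for photo in extract_items(SAMPLE_RESPONSE, "photo"):
        print(photo["id"], photo["title"])
```

Normalizing to a list keeps the calling code identical whether the query matched zero, one or many rows.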
Published at DZone with permission of Giorgio Sironi, author and DZone MVB.
