Build a Site Search with Yahoo! Search Web Services

Published in Application Programming Interfaces on Wednesday, November 9th, 2005

Looking at Yahoo!'s APIs, you can see a little of that "openness" that Yahoo! CEO Terry Semel referred to at 2005's Web 2.0 conference. From images to movies to maps and search, they offer a lot of data through their APIs.

Note: The script available at the end of the article was updated on 2005-11-16 to fix a small undefined index error.

Yahoo! Web Services

Yahoo! provides APIs for many of the services that it offers, all housed under the Yahoo! Developer Network. These include Flickr, Maps, Music, Search, Search Marketing, Shopping (including Shopping User Product Reviews), Travel and Konfabulator Widgets. They also provide the ability to create customizable RSS feeds.

Yahoo! Search Web Services

This post will examine Yahoo! Search Web Services, which consists of an API that allows developers to integrate Yahoo!'s search functionality into their web sites and applications.

The goal of this article will be to query the Yahoo! API server, and then to process the data once we have it. At the end of the article I have linked up a sample script which does just that.

Some details about the API

Yahoo!'s search API is quite a bit more complete than that offered by Google (which we will be looking at in a later post). Not only can you make more daily queries (up to 5000), but you can also get up to 100 results in one shot, request data from a given start position (good for paginating results) and tighten your searches using other advanced search parameters.

Terms of service

As with most APIs, there are restrictions that must be followed. For the example that will be rolled out here, an integrated site search, we are within the bounds of the API Terms of Service, but be aware that you cannot use the API for commercial use without permission from Yahoo!.

Before we get started

Yahoo! has a demo API key that they use in a few examples on their site, but if you are going to play around with the examples I would recommend that you head over and get an Application ID and the developer kit, which provides some examples in PHP penned by none other than Rasmus Lerdorf.

Let's get started - Step 1: Making a request

The Yahoo! Search Web Services are all REST services. In this case, this simply means that we need to make a GET request to their API server, passing it our API key along with any other parameters that outline the data we want to receive.

That request is structured according to the rules Yahoo! has set out, and the data returned is structured in a format that they have determined (in this case, XML).

Let's look at the base of a request URI:

http://api.search.yahoo.com/WebSearchService/V1/webSearch?

Starting from the hostname, they add the service name (WebSearchService) and version number followed by the method (webSearch) that will be used. From there we can add query parameters based on their rules for request parameters. Be aware that any values passed through the URI must be URL-encoded.

Yahoo! provides the following example URL, which passes the API key YahooDemo and asks for 2 results containing the term madonna:

http://api.search.yahoo.com/WebSearchService/V1/webSearch?appid=YahooDemo&query=madonna&results=2

Building our request

Now let's look at building our own request. One very common way of accomplishing this is to make an associative array whose keys are the parameters that we want to use and whose values are the values for those parameters.

Holding the base URI in another variable, we can pass them both to a URI building function that returns a complete URI for our purposes. See the code below for an example.
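A minimal sketch of that idea follows. The function name build_request_uri() is hypothetical, not taken from the downloadable script; the parameter names and base URI are the ones from Yahoo!'s madonna example above.

```php
<?php
// Parameters for the request: keys are Yahoo! parameter names, values are
// what we want to send. These mirror Yahoo!'s own madonna example.
$params = array(
    'appid'   => 'YahooDemo', // replace with your own Application ID
    'query'   => 'madonna',
    'results' => 2,
);

// Base URI for the web search method, as shown earlier in the article.
$base = 'http://api.search.yahoo.com/WebSearchService/V1/webSearch?';

/**
 * Hypothetical helper: append URL-encoded parameters to a base URI.
 */
function build_request_uri($base, $params)
{
    $pairs = array();
    foreach ($params as $name => $value) {
        // urlencode() both sides so spaces and symbols survive the trip.
        $pairs[] = urlencode($name) . '=' . urlencode((string) $value);
    }
    return $base . implode('&', $pairs);
}

$uri = build_request_uri($base, $params);
echo $uri, "\n";
```

On PHP 5 and later, the built-in http_build_query() does the same job as the loop; the loop is spelled out here only to make the encoding step visible.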

Note: We could expand on this by writing a function that builds the parameter array for us, grabbing values from GET variables passed through our own search form.

For example, it could take site.com/search?p=mysql&num=20&filetype=pdf and, with the right coding, build the correct Yahoo! specific request. For this post, I am keeping it simple.
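As a rough illustration of that note, here is one way such a mapping function might look. Everything in it is an assumption made for the sake of the example: the function name, the p, num and filetype field names (borrowed from the URL above), and in particular format as the Yahoo! parameter for file type, which should be checked against Yahoo!'s request parameter documentation.

```php
<?php
/**
 * Hypothetical helper: translate our own form's GET variables into a
 * Yahoo!-style parameter array. The field names (p, num, filetype) follow
 * the example URL above; 'format' as the file-type parameter is an
 * assumption.
 */
function params_from_get($get, $appid)
{
    $params = array('appid' => $appid);

    if (isset($get['p']) && is_string($get['p']) && trim($get['p']) !== '') {
        $params['query'] = trim($get['p']);
    }
    if (isset($get['num']) && is_numeric($get['num'])) {
        // Keep the count inside the 1-100 range mentioned earlier.
        $params['results'] = max(1, min(100, (int) $get['num']));
    }
    if (isset($get['filetype']) && is_string($get['filetype']) && $get['filetype'] !== '') {
        $params['format'] = $get['filetype'];
    }
    return $params;
}

// In a real page you would pass $_GET; a literal array stands in for it here.
$params = params_from_get(
    array('p' => 'mysql', 'num' => '20', 'filetype' => 'pdf'),
    'YahooDemo'
);
print_r($params);
```

Mapping field names explicitly, rather than passing $_GET straight through, means visitors can only set the parameters you have chosen to expose.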

Given the code above, we can now easily build a request URI to query the Yahoo! Search API server.

So far so good? Cool. Now, let's get some data!

Step 2: Retrieving the data

Now that we can build a request for some data, we need a function that sends the request and fetches the file.

As we are coding in PHP, our approach will be to open the file with fopen(), and then fread() the data into a variable, as outlined below:
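The fopen()/fread() approach could be sketched like this. The function name fetch_data() is hypothetical, and the code assumes that allow_url_fopen is enabled in php.ini, which PHP requires before fopen() will open http:// URIs.

```php
<?php
/**
 * Hypothetical helper: open a URI (or local path) with fopen() and read the
 * whole response into a string. Returns false if the resource cannot be
 * opened. Fetching http:// URIs this way requires allow_url_fopen to be on.
 */
function fetch_data($uri)
{
    // @ suppresses the warning on failure; we report it via the return value.
    $handle = @fopen($uri, 'r');
    if ($handle === false) {
        return false;
    }

    $data = '';
    // Network streams arrive in chunks, so keep reading until end of file.
    while (!feof($handle)) {
        $chunk = fread($handle, 8192);
        if ($chunk === false) {
            break;
        }
        $data .= $chunk;
    }
    fclose($handle);

    return $data;
}

// Usage (needs network access and your own Application ID):
// $xml = fetch_data($uri);
```

The loop matters because a single fread() call on a network stream may return less than the full document.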

Aside: Deconstructing the response

Okay, we've managed to build a request, access the resulting file and read it into a variable. Now may be a good time to have a quick look at what the response was from the Yahoo! API server.

I've included the response to the madonna example query below so that we can have a good look at it:

ResultSet

Let's look at the opening tag, ResultSet. For our purposes, we want to pay attention to the last three attributes, totalResultsAvailable="3610652" totalResultsReturned="2" firstResultPosition="1". Those attributes are fairly self-explanatory, and will be used when we present our results.

Result

The next section is a series of results, in this case two, each of which has a title, a summary, a URL and a ClickUrl. Yahoo! likes you to use the ClickUrl when you use their results in a system, so that they can track the usage.
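To make that structure concrete, here is a trimmed-down stand-in for a response, read with PHP 5's SimpleXML. Only the three ResultSet attribute values come from the description above. Every Result value is a placeholder, and the element names and their capitalisation are assumptions to check against a real response. SimpleXML is used here purely as a quick way to poke at the document; it is not the parsing approach chosen in Step 3.

```php
<?php
// Placeholder response: attribute values are from the article, everything
// inside each Result is made up for illustration.
$xml = <<<XML
<ResultSet totalResultsAvailable="3610652" totalResultsReturned="2" firstResultPosition="1">
  <Result>
    <Title>First placeholder title</Title>
    <Summary>First placeholder summary.</Summary>
    <Url>http://example.com/one</Url>
    <ClickUrl>http://example.com/click/one</ClickUrl>
  </Result>
  <Result>
    <Title>Second placeholder title</Title>
    <Summary>Second placeholder summary.</Summary>
    <Url>http://example.com/two</Url>
    <ClickUrl>http://example.com/click/two</ClickUrl>
  </Result>
</ResultSet>
XML;

$resultSet = simplexml_load_string($xml);

// Attributes are read with array syntax; cast them to get plain integers.
$available = (int) $resultSet['totalResultsAvailable'];
$returned  = (int) $resultSet['totalResultsReturned'];
$first     = (int) $resultSet['firstResultPosition'];

echo "Showing $first to " . ($first + $returned - 1) . " of $available\n";

// Each Result child carries the fields for one hit.
foreach ($resultSet->Result as $result) {
    echo $result->Title, ' -> ', $result->ClickUrl, "\n";
}
```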

And that is about it. Not too complicated, no?

Step 3: Extracting the data

At this point we hit a bit of a crossroads. As we are using PHP in this example, we can do one of three things to parse the resulting XML document:

  1. We can use PHP's built-in SAX parser.
  2. We can use PHP's Document Object Model functions.
  3. We can use an external library to convert the document into an associative array.

Getting into either of numbers 1 or 2 would be a bit much for this article, in my opinion. It is already long enough, and I'm sure people would rather get to the meat and play with the API, so we are going to use an external library to unserialize the returned XML document into an array.

PHP XML libraries

There are a couple of libraries out there, for example minixml and Keith Devens' PHP XML library. For this article, I'll be using Keith's library, as it is a smaller, one file include (and open source).

It is worth noting that this library uses the SAX engine (#1 from above) to get its work done, which consists of serializing XML from an array and vice versa.

XML => Array

This is quite simple with Keith's library. From Step 2 above, we already have our data held in a variable called $xml, so now all we have to do is pass it to the XML_unserialize($xml) function and, as seen in the example below, our data will be held in an array called $data (note that this example builds on the function from Step 2 above):

Here is a look at print_r($data) after running the madonna search through the above code. As you will see, data for the search numbers is held in $data['ResultSet attr']. We can access our search results via $data['ResultSet']['Result']:
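If you would rather not pull in an external library, the same XML-to-array idea can be approximated with what ships in PHP 5.2 and later. The sketch below is a swapped-in technique (SimpleXML plus a JSON round trip), not Keith Devens' XML_unserialize(), and its array layout differs: the ResultSet attributes land under '@attributes' and the results under 'Result', with no outer 'ResultSet' key. The function name xml_to_array() and the placeholder values are hypothetical.

```php
<?php
/**
 * Stand-in for the library's XML-to-array step, built on SimpleXML and a
 * JSON round trip. Not Keith Devens' XML_unserialize(): the key layout
 * differs (attributes land under '@attributes').
 */
function xml_to_array($xml)
{
    $element = simplexml_load_string($xml);
    if ($element === false) {
        return false;
    }
    $data = json_decode(json_encode($element), true);

    // A lone <Result> decodes as one associative array, not a list of one,
    // so normalise it to keep later loops simple.
    if (isset($data['Result']) && !isset($data['Result'][0])) {
        $data['Result'] = array($data['Result']);
    }
    return $data;
}

// Placeholder response: only the attribute values come from the article.
$xml = '<ResultSet totalResultsAvailable="3610652" totalResultsReturned="2" firstResultPosition="1">'
     . '<Result><Title>Placeholder one</Title><ClickUrl>http://example.com/click/one</ClickUrl></Result>'
     . '<Result><Title>Placeholder two</Title><ClickUrl>http://example.com/click/two</ClickUrl></Result>'
     . '</ResultSet>';

$data = xml_to_array($xml);
print_r($data['@attributes']);
echo count($data['Result']), " results\n";
```

Note that attribute values come back as strings, so cast them before doing arithmetic with them.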

Presenting our results

Now we have our data, and simply need to process the array into the format or markup that we desire. This little bit I'm not going to cover here, though I do offer an example in the file available at the end of this document.

Let's pull it all together

Now that we have gone through these explanations, let's look at what we need to do to build a site search feature with the Yahoo! Search API:

  1. A form for people to enter their search string.
  2. A function to take that search string and build a URI according to the Yahoo! specifications, so that we can request a set of search results. This was covered in Step 1 above.
  3. A set of functions to take the URI from above, open the file and retrieve the results, and then unserialize those results from XML to a PHP array. This was covered in Steps 2 and 3 above.
  4. A function to process the resulting array into HTML for display in a website.
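Item 4 in the list above could be sketched roughly as follows. This is not the function from the downloadable script: the name render_results() is hypothetical, and it assumes each result is an array with Title, Summary, Url and ClickUrl keys.

```php
<?php
/**
 * Hypothetical helper: turn a list of result arrays into an HTML list.
 * Assumes each result has Title, Summary, Url and ClickUrl keys.
 */
function render_results($results)
{
    if (count($results) === 0) {
        return "<p>No results found.</p>\n";
    }

    $html = "<ol>\n";
    foreach ($results as $result) {
        // Escape everything: result text comes from pages we do not control.
        $title   = htmlspecialchars($result['Title'], ENT_QUOTES, 'UTF-8');
        $summary = htmlspecialchars($result['Summary'], ENT_QUOTES, 'UTF-8');
        $url     = htmlspecialchars($result['Url'], ENT_QUOTES, 'UTF-8');
        $click   = htmlspecialchars($result['ClickUrl'], ENT_QUOTES, 'UTF-8');

        // Link through ClickUrl, as Yahoo! asks, but display the plain Url.
        $html .= '  <li><a href="' . $click . '">' . $title . '</a><br />'
               . $summary . '<br /><cite>' . $url . "</cite></li>\n";
    }
    return $html . "</ol>\n";
}

// Placeholder data standing in for an unserialized response.
echo render_results(array(
    array(
        'Title'    => 'Placeholder title',
        'Summary'  => 'Placeholder summary.',
        'Url'      => 'http://example.com/',
        'ClickUrl' => 'http://example.com/click',
    ),
));
```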

Download some code!

Here is an example script that pulls this whole article together and accomplishes the list outlined above. When using, remember:

  1. To change the extension to .php.
  2. That you need a server with PHP installed to run it.
  3. That you will need a Yahoo! Application ID to use it.

A short discussion

Obviously there are many more things that can be done with this Yahoo! API. You can add features into the sample script by simply adding them to the form as options which get dumped into the $params array, or more simply by adding them directly to that array in the code.

Please keep in mind that I haven't done any cleaning of the user input search string. If you do use this code and plan on echoing the search terms back to the user, be sure to clean the input first.
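One common way to do that cleaning is to escape the string with htmlspecialchars() at the moment it is echoed. The sketch below is an illustration only: the function name clean_search_term() and the form field name p are assumptions, not part of the sample script.

```php
<?php
/**
 * Hypothetical helper: make a user-supplied search string safe to echo
 * into HTML. Non-string input (for example p[]=x) is treated as empty.
 */
function clean_search_term($raw)
{
    if (!is_string($raw)) {
        return '';
    }
    return htmlspecialchars(trim($raw), ENT_QUOTES, 'UTF-8');
}

// 'p' is an assumed name for the search field in our own form.
$term = clean_search_term(isset($_GET['p']) ? $_GET['p'] : '');
echo '<input type="text" name="p" value="' . $term . '" />', "\n";
```

Keep the raw, unescaped string for building the request URI; URL-encoding handles that side, and HTML escaping is only for output.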

Some other possibilities exist as well. Obviously paginating the data is possible, and for some situations, like a site search, one may want to filter out home pages and other pages that may have new data on them since Yahoo! last crawled the site being queried.
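For pagination, the arithmetic is small enough to sketch. Since firstResultPosition is 1 in the example response, positions are assumed to be 1-based. The function names are hypothetical, and start as the name of the request parameter is an assumption to confirm against Yahoo!'s documentation.

```php
<?php
/**
 * Hypothetical helper: convert a 1-based page number into the 1-based
 * start position to request.
 */
function page_to_start($page, $perPage)
{
    $page = max(1, (int) $page); // treat anything below page 1 as page 1
    return ($page - 1) * $perPage + 1;
}

/**
 * Hypothetical helper: how many pages are needed for a given total,
 * for example the totalResultsAvailable attribute.
 */
function total_pages($totalAvailable, $perPage)
{
    return (int) ceil($totalAvailable / $perPage);
}

// Parameters for page 3 at 10 results per page.
$params = array(
    'appid'   => 'YahooDemo',
    'query'   => 'madonna',
    'results' => 10,
    'start'   => page_to_start(3, 10), // 'start' is an assumed parameter name
);
print_r($params);
```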

An excellent example

Over at Using Wikipedia and the Yahoo API to give structure to flat lists, they have documented an interesting approach to cleaning up their data by using the Yahoo! API and a site specific search. Great stuff.

Disclaimer

This was my first crack at a longer technical post here on Fiftyfoureleven.com, so apologies if some things aren't very clear. Please feel free to ask away in the comments. Ditto if I've made an error somewhere!

I've already noticed some limitations of this new design, so I'm hoping to have a widescreen alternate stylesheet for code viewing ready for next week.

Next up

Next week will see a double attack of the Google (Wednesday) and MSN (later) search APIs, after which we'll try and move into some other juicier offerings and also deal with request caching, among other things.

Comments and Feedback

Mmmmmm...Y! API's. Mike, this is a fantastic piece you've put together.
This is very long indeed. I'll let some of the engineers on the search team know that you've put this together.

Long, yeah.

Lots to stick in there, and so much more that could be done as well! I hope that the code file at the end at least makes it worthwhile :-) Be interesting to hear what the insiders have to think...

Great piece of work . Will surely try this out. Looking forward for next wednesday.

Thanks Kaushal, I hope you find it useful! I love that header image in Blue Horizon, btw. Very nice...

Thank You, I love playing with colours and images. I am here to rock Wordpres world you just wait and watch. Still learning but will get hang of it in 3-4 months :)

