Thursday, August 1, 2013

Amazon's CloudSearch: Implementation using C#.Net (dotnet)

After working with DTSearch for a long time, we switched over to Amazon CloudSearch. The deciding factor was CloudSearch's automatic scaling; with DTSearch we had to handle scalability issues ourselves.

 What is Amazon CloudSearch?

Amazon CloudSearch is a fully-managed service in the AWS Cloud that makes it easy to set up, manage, and scale a search solution for your website or application. Amazon CloudSearch enables you to search large collections of data such as web pages, document files, forum posts, or product information. Built for high throughput and low latency, Amazon CloudSearch supports a rich set of features including free text search, faceted search, customizable relevance ranking, configurable search fields, text processing options, and near real-time indexing.

To use Amazon CloudSearch, you simply:
1. Create a search domain
2. Configure your search fields
3. Upload your data for indexing, and
4. Submit search requests from your website or application

You can find more details at http://aws.amazon.com/cloudsearch/

You can use Amazon CloudSearch to index and search both structured data and plain text. Amazon CloudSearch supports full text search, searching within fields, prefix searches, Boolean searches, and faceting. You can get search results in JSON or XML, sort and filter results based on field values, and rank results alphabetically, numerically, or according to custom rank expressions.

To build a search solution with Amazon CloudSearch, you:

1. Create and configure a search domain. A search domain encapsulates your searchable data and the search instances that handle your search requests. You set up a separate domain for each different data set you want to search.

2. Upload the data you want to search to your domain. Amazon CloudSearch automatically indexes your data and deploys the search index to one or more search instances.

3. Search your domain. You send a search request to your domain's search endpoint as an HTTP/HTTPS GET request.

To learn more, see the Amazon CloudSearch developer guide:
http://docs.aws.amazon.com/cloudsearch/latest/developerguide/SvcIntro.html

At the time of writing, Amazon's .NET SDK does not include an implementation for CloudSearch. Here is C# code you can use to post document batches to a CloudSearch domain and to send search requests against its indexes:

using System;
using System.IO;
using System.Net;
using System.Text;

namespace AmazonCloudSearch
{
    /// <summary>
    /// Data Formats in Cloud Search
    /// </summary>
    public enum DataFormat
    {
        Json,Xml
    }

    public class CloudSearch
    {

        /// <summary>
        /// Adds, updates, or deletes documents by posting a document batch to Amazon CloudSearch.
        /// </summary>
        /// <param name="JsonFormat">The document batch body, in the format given by <paramref name="format"/>.</param>
        /// <param name="DocumentUri">Host name of the domain's document endpoint.</param>
        /// <param name="ApiVersion">API version path segment placed between the host and "batch".</param>
        /// <param name="format">Format of the batch body (JSON or XML).</param>
        /// <returns>The HTTP status code followed by the response body.</returns>
        public string PostDocuments(string JsonFormat, string DocumentUri, string ApiVersion, DataFormat format)
        {
            var request = (HttpWebRequest)WebRequest.Create(string.Format("http://{0}{1}batch", DocumentUri, ApiVersion));
            request.ProtocolVersion = HttpVersion.Version11;
            request.Method = "POST";
            byte[] postBytes = Encoding.UTF8.GetBytes(JsonFormat);
            request.ContentLength = postBytes.Length;
            request.Accept = "application/json";
            request.ContentType = format == DataFormat.Xml ? "text/xml;charset=utf-8" : "application/json;charset=utf-8";

            // Write the batch body; the using block closes the stream even on failure.
            using (Stream requestStream = request.GetRequestStream())
            {
                requestStream.Write(postBytes, 0, postBytes.Length);
            }

            // Exceptions (including WebException) propagate to the caller with
            // their original stack trace; "throw ex" would have reset it.
            using (var response = (HttpWebResponse)request.GetResponse())
            using (var reader = new StreamReader(response.GetResponseStream()))
            {
                return response.StatusCode + reader.ReadToEnd();
            }
        }


        /// <summary>
        /// Sends a search request to Amazon CloudSearch.
        /// </summary>
        /// <param name="SearchUri">Search endpoint URL, up to the point where the query string starts.</param>
        /// <param name="SearchQuery">Query string fragment holding the search criteria.</param>
        /// <param name="ReturnFields">Comma-separated list of fields to return.</param>
        /// <param name="PageSize">Number of hits to return.</param>
        /// <param name="Start">Offset of the first hit to return.</param>
        /// <param name="format">Format of the response (JSON or XML).</param>
        /// <returns>The response body returned by the server.</returns>
        public string SearchRquest(string SearchUri, string SearchQuery, string ReturnFields, int PageSize, int Start, DataFormat format)
        {
            string SearchUrl = SearchUri + SearchQuery + "&return-fields=" + ReturnFields + "&start=" + Start + "&size=" + PageSize;
            if (format == DataFormat.Xml)
                SearchUrl = SearchUrl + "&results-type=xml";

            // Create a request for the URL.
            WebRequest request = WebRequest.Create(SearchUrl);
            // If required by the server, set the credentials.
            request.Credentials = CredentialCache.DefaultCredentials;
            request.ContentType = "application/json";

            // Read the whole response; the using blocks close the reader, the
            // stream, and the response even if reading fails. Exceptions
            // propagate to the caller with their original stack trace.
            using (WebResponse response = request.GetResponse())
            using (Stream dataStream = response.GetResponseStream())
            using (StreamReader reader = new StreamReader(dataStream))
            {
                return reader.ReadToEnd();
            }
        }
    }
}
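
PostDocuments expects the batch body as a ready-made string. The sketch below shows one way to build a single-document JSON batch. It assumes the batch shape of the 2011-02-01 API (an array of operations, each with type, id, version, lang, and fields); check the developer guide for the API version your domain uses. BatchBuilder and BuildAddBatch are hypothetical names, not part of the class above, and all field values are sent as strings.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: builds a JSON document batch in the shape assumed
// for the 2011-02-01 CloudSearch API.
public static class BatchBuilder
{
    // Escapes the characters that must be escaped inside a JSON string.
    static string JsonEscape(string value)
    {
        var sb = new StringBuilder();
        foreach (char c in value)
        {
            switch (c)
            {
                case '"': sb.Append("\\\""); break;
                case '\\': sb.Append("\\\\"); break;
                case '\n': sb.Append("\\n"); break;
                case '\r': sb.Append("\\r"); break;
                case '\t': sb.Append("\\t"); break;
                default:
                    if (c < ' ') sb.Append("\\u").Append(((int)c).ToString("x4"));
                    else sb.Append(c);
                    break;
            }
        }
        return sb.ToString();
    }

    // Builds a batch containing one "add" operation for the given document.
    public static string BuildAddBatch(string id, int version, IDictionary<string, string> fields)
    {
        var parts = new List<string>();
        foreach (KeyValuePair<string, string> field in fields)
            parts.Add("\"" + JsonEscape(field.Key) + "\":\"" + JsonEscape(field.Value) + "\"");

        return "[{\"type\":\"add\",\"id\":\"" + JsonEscape(id) + "\",\"version\":" + version
             + ",\"lang\":\"en\",\"fields\":{" + string.Join(",", parts.ToArray()) + "}}]";
    }
}
```

The resulting string can be passed as the first argument of PostDocuments together with DataFormat.Json.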

Here is a snippet showing how to call the class above from your own code:

AmazonCloudSearch.CloudSearch ObjCloudSearch = new AmazonCloudSearch.CloudSearch();
string strResponse = string.Empty;
if (IsSortByViews)
   strQuery = strQuery + "&rank=-vw";
else if (isSortByLatest)
   strQuery = strQuery + "&rank=-pid";
//Call Amazon Cloud Search API Service -- views and contentid are the indexed fields
strResponse = ObjCloudSearch.SearchRquest(configSection.SearchUri, strQuery, strReturnFields, p_intPageSize, p_intStart, AmazonCloudSearch.DataFormat.Json);


You can build strQuery from your search criteria using the query syntax described in the developer guide.
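
As a minimal sketch of building such a query string, the helper below URL-encodes the search terms and optionally appends a rank clause in the same form used in the snippet above. It assumes that "q" is the free-text search parameter (as in the 2011-02-01 API) and that the configured search URI already ends where the query string begins, since SearchRquest simply concatenates the two. QueryBuilder and Build are hypothetical names.

```csharp
using System;

// Hypothetical helper: builds the query-string fragment passed to
// SearchRquest as its SearchQuery argument.
public static class QueryBuilder
{
    // terms:      free-text search terms, encoded here so that spaces and
    //             characters such as '&' cannot break the URL.
    // rankField:  indexed field to sort by, or null for the default order.
    // descending: true prefixes the field with '-' (highest value first).
    public static string Build(string terms, string rankField, bool descending)
    {
        string query = "q=" + Uri.EscapeDataString(terms);
        if (!string.IsNullOrEmpty(rankField))
            query += "&rank=" + (descending ? "-" : "") + rankField;
        return query;
    }
}
```

For example, QueryBuilder.Build("star wars", "vw", true) yields the fragment q=star%20wars&rank=-vw.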

Monday, February 25, 2013

Windows Azure based new Bing Search API


Microsoft launched Bing Search a few years ago to compete with Google Search, with an edge for developers: they could build their own solutions on top of it. Microsoft exposed a public search API that the developer community around the globe could use easily.
The Bing Search API enables developers to embed and customize search results in applications or websites using XML or JSON. Add search functionality to a website, create unique consumer or enterprise apps, or develop new mash-ups. The Bing Search API gives you access to web, image, news, and video results, as well as related search and spelling suggestions. Microsoft released Bing Search API 2.0 earlier.
With the introduction of Windows Azure, Bing Search API 2.0 has transitioned to a new cloud-based data service that is available by subscription, based on usage requirements. The new cloud-based Bing Search API lets developers build their own applications that search, retrieve, and use the data provided by Bing. They can also analyze the data online using the new Service Explorer tool.

The new version of the Bing Search API includes:
• Metered subscription of query limits.
• HTTPS query URLs (sometimes called "endpoints") that provide results in either XML or JSON media formats.
• Open Data Protocol (OData) support for easy consumption across multiple development systems.
• Improved support for data types.
• The ability to monetize applications in the Windows Azure Marketplace.
• Access to fresher results and improved relevance.

For users of the old Bing Search API 2.0, migration documents are provided at the Windows Azure Marketplace to help move to the new cloud-based Bing Search API.

I was a user of the old Bing Search API 2.0; we migrated to the new cloud-based search API after our product code suddenly broke. The changes were made within hours. I faced two issues when we moved to the new API. First, our code was written for .NET 2.0, so the new BingSearchContainer.cs provided at the Windows Azure Marketplace did not work as-is, because that file requires .NET 4.0.
The following changes are needed to migrate to the new Bing Search API:
1. Change your API service root URL to https://api.datamarket.azure.com/Bing/Search/
2. Get your Bing account key
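
Both listings that follow pass the account key as user name and password through NetworkCredential. If you build requests by hand instead, the same credentials can be sent as an HTTP Basic Authorization header. The sketch below assumes the service accepts Basic authentication with the key in both positions, mirroring NetworkCredential(key, key); BingAuth and BasicHeaderValue are hypothetical names.

```csharp
using System;
using System.Text;

// Hypothetical helper: produces the value of an HTTP Basic "Authorization"
// header from a Bing account key.
public static class BingAuth
{
    public static string BasicHeaderValue(string accountKey)
    {
        // Basic authentication encodes "user:password" as Base64; the
        // account key is used for both parts here.
        byte[] raw = Encoding.ASCII.GetBytes(accountKey + ":" + accountKey);
        return "Basic " + Convert.ToBase64String(raw);
    }
}
```

The returned value would be set with request.Headers["Authorization"] on an HttpWebRequest. Keep the real key out of source control; the listings below hard-code a placeholder only for brevity.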


Using the new BingSearchContainer.cs as provided at the Windows Azure Marketplace:
internal class SearchBing : SearchWeb, IDisposable
{
    Dictionary<SettingFields, string> _optionalParameters;
    List<Dictionary<SearchResultFields, string>> _newPage;
    const string SEARCHAPISERVICEURI = "https://api.datamarket.azure.com/Bing/Search/";
    const string BINGACCOUNTKEY = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"; // Set your account key here

    internal Dictionary<SettingFields, string> OptionalParameters
    {
        set
        {
            _optionalParameters = value;
        }
    }

    internal List<Dictionary<SearchResultFields, string>> NewPage
    {
        get
        {
            return _newPage;
        }
    }

    internal void Search(string _searchQuery, uint _offset)
    {
        var _bingSearchAzure = new BingSearchContainer(new Uri(SEARCHAPISERVICEURI));
        // The account key serves as both user name and password.
        _bingSearchAzure.Credentials = new System.Net.NetworkCredential(BINGACCOUNTKEY, BINGACCOUNTKEY);

        string options = String.Empty;
        if (_optionalParameters != null)
            options = SetOptionalParameters();

        var _searchResponse = _bingSearchAzure.Image(_searchQuery, null, null, "strict", null, null, options);
        _searchResponse = _searchResponse.AddQueryOption("$top", this._resultsPerPage);
        _searchResponse = _searchResponse.AddQueryOption("$skip", _offset);

        var result = _searchResponse.Execute();
        this.TotalResults = 1000;
        _newPage = new List<Dictionary<SearchResultFields, string>>();
        foreach (ImageResult _result in result)
        {
            Dictionary<SearchResultFields, string> _newResult = new Dictionary<SearchResultFields, string>();
            _newResult.Add(SearchResultFields.ThumbnailURL, _result.Thumbnail.MediaUrl);
            _newResult.Add(SearchResultFields.ThumbnailHeight, _result.Thumbnail.Height.ToString());
            _newResult.Add(SearchResultFields.ThumbnailWidth, _result.Thumbnail.Width.ToString());
            _newResult.Add(SearchResultFields.MainImageURL, _result.MediaUrl);
            _newResult.Add(SearchResultFields.Source, _result.SourceUrl);
            _newResult.Add(SearchResultFields.Title, _result.Title);
            _newResult.Add(SearchResultFields.Height, _result.Height.ToString());
            _newResult.Add(SearchResultFields.Width, _result.Width.ToString());
            _newResult.Add(SearchResultFields.Size, _result.FileSize.ToString());
            _newPage.Add(_newResult);
        }
    }

    // Joins the optional parameters as "Key:Value" pairs separated by "+",
    // so that they can be passed to Image() as a single string.
    private string SetOptionalParameters()
    {
        string[] _parametersArray = new string[_optionalParameters.Count];
        int _index = 0;
        foreach (KeyValuePair<SettingFields, string> _parameters in _optionalParameters)
        {
            _parametersArray[_index] = _parameters.Key.ToString() + ":" + _parameters.Value;
            _index = _index + 1;
        }
        return String.Join("+", _parametersArray);
    }
}



Using the new Bing Search API with a web request (a workable solution for any framework version):


internal class SearchBing : SearchWeb, IDisposable
{
    Dictionary<SettingFields, string> _optionalParameters;
    List<Dictionary<SearchResultFields, string>> _newPage;
    const string SEARCHAPISERVICEURI = "https://api.datamarket.azure.com/Bing/Search/";
    const string BINGACCOUNTKEY = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"; // Set your account key here

    internal Dictionary<SettingFields, string> OptionalParameters
    {
        set
        {
            _optionalParameters = value;
        }
    }

    internal List<Dictionary<SearchResultFields, string>> NewPage
    {
        get
        {
            return _newPage;
        }
    }

    internal void Search(string _searchQuery, uint _offset)
    {
        // Escape the query text so that spaces and reserved characters cannot break the URL.
        string imageSearchQueryUrl = SEARCHAPISERVICEURI + "Image?Query=%27" + Uri.EscapeDataString(_searchQuery) + "%27&$top=" + this._resultsPerPage + "&$skip=" + _offset + "&$Options=%27" + System.Web.HttpUtility.UrlEncode(SetOptionalParameters()) + "%27";

        XmlDocument xmlParseDom = new XmlDocument();
        // The account key serves as both user name and password.
        System.Net.NetworkCredential accountCredential = new System.Net.NetworkCredential(BINGACCOUNTKEY, BINGACCOUNTKEY);
        XmlUrlResolver resolver = new XmlUrlResolver();
        resolver.Credentials = accountCredential;

        xmlParseDom.XmlResolver = resolver;
        xmlParseDom.Load(imageSearchQueryUrl);

        XmlNamespaceManager namespaceMgr = new XmlNamespaceManager(xmlParseDom.NameTable);
        namespaceMgr.AddNamespace("atom", "http://www.w3.org/2005/Atom");
        namespaceMgr.AddNamespace("m", "http://schemas.microsoft.com/ado/2007/08/dataservices/metadata");
        namespaceMgr.AddNamespace("d", "http://schemas.microsoft.com/ado/2007/08/dataservices");

        // URL of the next page of results; null when the feed has no "next" link.
        XmlNode nextLink = xmlParseDom.SelectSingleNode("/atom:feed/atom:link[@rel='next']/@href", namespaceMgr);
        string nextResultSet = nextLink != null ? nextLink.Value : null;

        XmlNodeList imageresults = xmlParseDom.SelectNodes("/atom:feed/atom:entry/atom:content/m:properties", namespaceMgr);

        this.TotalResults = 1000;
        _newPage = new List<Dictionary<SearchResultFields, string>>();
        foreach (XmlNode _result in imageresults)
        {
            Dictionary<SearchResultFields, string> _newResult = new Dictionary<SearchResultFields, string>();

            // The thumbnail is read from the last child of m:properties.
            XmlNodeList thumbnailNodelst = _result.LastChild.ChildNodes;
            foreach (XmlNode xmlthumbnailNode in thumbnailNodelst)
            {
                if (xmlthumbnailNode.Name == "d:MediaUrl")
                    _newResult.Add(SearchResultFields.ThumbnailURL, xmlthumbnailNode.ChildNodes[0].InnerText);
                if (xmlthumbnailNode.Name == "d:Width")
                    _newResult.Add(SearchResultFields.ThumbnailWidth, xmlthumbnailNode.ChildNodes[0].InnerText);
                if (xmlthumbnailNode.Name == "d:Height")
                    _newResult.Add(SearchResultFields.ThumbnailHeight, xmlthumbnailNode.ChildNodes[0].InnerText);
            }

            _newResult.Add(SearchResultFields.MainImageURL, _result.SelectSingleNode(".//d:MediaUrl", namespaceMgr).InnerText);
            _newResult.Add(SearchResultFields.Source, _result.SelectSingleNode(".//d:SourceUrl", namespaceMgr).InnerText);
            _newResult.Add(SearchResultFields.Title, _result.SelectSingleNode(".//d:Title", namespaceMgr).InnerText);
            _newResult.Add(SearchResultFields.Height, _result.SelectSingleNode(".//d:Height", namespaceMgr).InnerText);
            _newResult.Add(SearchResultFields.Width, _result.SelectSingleNode(".//d:Width", namespaceMgr).InnerText);
            _newResult.Add(SearchResultFields.Size, _result.SelectSingleNode(".//d:FileSize", namespaceMgr).InnerText);

            _newPage.Add(_newResult);
        }
    }

    // Joins the optional parameters as "Key:Value" pairs separated by "+".
    // Returns an empty string when no optional parameters have been set.
    private string SetOptionalParameters()
    {
        string _parametersString = String.Empty;
        if (_optionalParameters == null)
            return _parametersString;

        int _index = 0;
        foreach (KeyValuePair<SettingFields, string> _parameters in _optionalParameters)
        {
            if (_index > 0)
                _parametersString = _parametersString + "+";
            _parametersString = _parametersString + _parameters.Key.ToString() + ":" + _parameters.Value;
            _index = _index + 1;
        }
        return _parametersString;
    }
}
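
The XPath handling in the web request version can be tried offline. The sketch below runs the same namespace setup against a small hand-written feed; the sample XML is an illustrative assumption trimmed to the elements used here, not a captured Bing response, and AtomSample and its members are hypothetical names. It also shows why the "next" link has to be null-checked: a feed without one yields no node.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class AtomSample
{
    // Hand-written sample feed with one entry and no "next" link.
    const string SampleFeed =
        "<feed xmlns='http://www.w3.org/2005/Atom'" +
        " xmlns:m='http://schemas.microsoft.com/ado/2007/08/dataservices/metadata'" +
        " xmlns:d='http://schemas.microsoft.com/ado/2007/08/dataservices'>" +
        "<entry><content type='application/xml'><m:properties>" +
        "<d:Title>Sample image</d:Title>" +
        "<d:MediaUrl>http://example.com/full.jpg</d:MediaUrl>" +
        "<d:Thumbnail><d:MediaUrl>http://example.com/thumb.jpg</d:MediaUrl></d:Thumbnail>" +
        "</m:properties></content></entry>" +
        "</feed>";

    public static XmlDocument LoadSample()
    {
        var doc = new XmlDocument();
        doc.LoadXml(SampleFeed);
        return doc;
    }

    // Registers the prefixes used by the XPath expressions below.
    static XmlNamespaceManager Namespaces(XmlDocument doc)
    {
        var ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("atom", "http://www.w3.org/2005/Atom");
        ns.AddNamespace("m", "http://schemas.microsoft.com/ado/2007/08/dataservices/metadata");
        ns.AddNamespace("d", "http://schemas.microsoft.com/ado/2007/08/dataservices");
        return ns;
    }

    // Returns the URL of the next page, or null when there is none.
    public static string NextPageUrl(XmlDocument doc)
    {
        XmlNode href = doc.SelectSingleNode("/atom:feed/atom:link[@rel='next']/@href", Namespaces(doc));
        return href == null ? null : href.Value;
    }

    // Returns the main image URL of every entry. "d:MediaUrl" matches only
    // direct children of m:properties, so the thumbnail URL is skipped.
    public static List<string> MainImageUrls(XmlDocument doc)
    {
        var urls = new List<string>();
        XmlNamespaceManager ns = Namespaces(doc);
        foreach (XmlNode props in doc.SelectNodes("/atom:feed/atom:entry/atom:content/m:properties", ns))
            urls.Add(props.SelectSingleNode("d:MediaUrl", ns).InnerText);
        return urls;
    }
}
```

Selecting the direct child "d:MediaUrl" avoids relying on document order to tell the main image apart from its thumbnail, which is what the ".//d:MediaUrl" form in the listing above depends on.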