Blogger Posts Searcher using Google Data .NET/Java Client APIs

Today I wanted to know whether I had already published a post with a given title on one of my blogs: http://jes4us.blogspot.com. While translating (I translate the posts from English to Portuguese) I had the feeling that I had already worked on a similar text… well, it turns out I was mistaken!

Instead of going through the extensive list of posts one by one, I thought: why not leverage the power of the Google Data API? You may say: why not do a simple Google search instead? Good point, but since I like to play with code I couldn’t resist.

So here it is: a simple and faster way of knowing whether I have a post with a given title. Below you’ll find the code for both the .NET client API and the Java one.

Blogger Data API for .NET
1 - Download the client library here: http://code.google.com/p/google-gdata/downloads/list

2 - Install the .msi package Google_Data_API_Setup_1.9.0.0.msi.

3 - Create a new Console project and reference the DLL Google.GData.Client that’s in this folder: C:\Google Data API SDK\Redist

using System;
using System.Linq;
using Google.GData.Client;

namespace BlogPostsSearcher
{
    class Program
    {
        static void Main(string[] args)
        {
            Service bloggerService = AcquireService();

            AtomFeed feed = AcquireAndSetupFeed(bloggerService);

            // Search posts that contain the word "StringToSearchFor" in their titles
            var query = feed.Entries.Where(p => p.Title.Text.Contains("StringToSearchFor"));

            // Writes the Blog's Title
            Console.WriteLine(feed.Title.Text);

            // Prints each post found...
            foreach (AtomEntry entry in query)
            {
                Console.WriteLine(string.Format("Post Title: {0} - Date Published: {1}", entry.Title.Text, entry.Published.ToShortDateString()));
            }

        }

        private static AtomFeed AcquireAndSetupFeed(Service service)
        {
            // Replace YourBlogID in the URL with the ID of your blog
            FeedQuery blogFeedUri = new FeedQuery("http://www.blogger.com/feeds/YourBlogID/posts/default");

            // Setting the number of posts to retrieve
            blogFeedUri.NumberToRetrieve = 1000;

            AtomFeed feed = service.Query(blogFeedUri);
            
            return feed;
        }

        private static Service AcquireService()
        {
            Service service = new Service("blogger", "YourCompanyName-BloggerPostsSearcher");

            service.Credentials = new GDataCredentials("YourEmailAddress@gmail.com", "YourPassword");

            GDataGAuthRequestFactory factory = (GDataGAuthRequestFactory)service.RequestFactory;
            
            return service;
        }
    }
}

Blogger Data API for Java
1 - Download the client library here: http://code.google.com/p/gdata-java-client/downloads/list

2 - Unzip the file http://code.google.com/p/gdata-java-client/downloads/detail?name=gdata-src.java-1.46.0.zip

3 - Create a new Java Project and add references to:
- gdata-client-1.0.jar that’s in this path: gdata/java/lib/
- google-collect-1.0-rc1 that’s in this path: gdata/java/deps/

import java.io.IOException;
import java.net.URL;
import java.util.List;

import com.google.gdata.client.GoogleService;
import com.google.gdata.client.Query;
import com.google.gdata.data.Entry;
import com.google.gdata.data.Feed;
import com.google.gdata.util.AuthenticationException;
import com.google.gdata.util.ServiceException;

/**
 * @author Leniel Macaferi
 * @date 11-21-2011
 */
public class BloggerClient
{
    public static void main(String[] args) throws IOException, ServiceException
    {
        try
        {
            GoogleService bloggerService = new GoogleService("blogger", "YourCompanyName-BloggerPostsSearcher");

            bloggerService.setUserCredentials("YourEmailAddress@gmail.com", "YourPassword");

            searchPosts(bloggerService, "YourBlogID", "StringToSearchFor");
        }
        catch (AuthenticationException e)
        {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }

    public static void searchPosts(GoogleService myService, String blogId, String search) throws ServiceException, IOException
    {
        URL feedUrl = new URL("http://www.blogger.com/feeds/" + blogId + "/posts/default");

        // Setting the number of posts to retrieve (a query parameter,
        // so it has to be set before the feed is requested)
        Query query = new Query(feedUrl);
        query.setMaxResults(1000);

        // Request the feed
        Feed resultFeed = myService.query(query, Feed.class);

        List<Entry> posts = resultFeed.getEntries();

        // Print the blog's title, then each post whose title contains the search string
        System.out.println(resultFeed.getTitle().getPlainText());

        for (Entry post : posts)
        {
            if (post.getTitle().getPlainText().contains(search))
            {
                System.out.println("\t" + post.getTitle().getPlainText());
            }
        }

        System.out.println();
    }
}

In the code above, replace the following placeholders with your own values:

- YourEmailAddress
- YourPassword
- YourBlogID
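
Both listings match titles with a case-sensitive Contains/contains call, so searching for "faith" would miss a post titled "FAITH". As a minimal sketch of a case-insensitive variant of the same check (the TitleSearch class, the findByTitle method and the sample titles are hypothetical and only stand in for the feed entries; nothing here calls the Google Data library):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration; not part of the Google Data client library.
final class TitleSearch {

    // Returns the titles that contain the search text, ignoring upper/lower case.
    static List<String> findByTitle(List<String> titles, String search) {
        String needle = search.toLowerCase(Locale.ROOT);
        List<String> matches = new ArrayList<>();
        for (String title : titles) {
            if (title.toLowerCase(Locale.ROOT).contains(needle)) {
                matches.add(title);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Made-up titles standing in for the titles read from a blog feed.
        List<String> titles = Arrays.asList("Faith and Works", "A walk of FAITH", "Hope");
        for (String title : findByTitle(titles, "faith")) {
            System.out.println(title);
        }
    }
}
```

In the Java listing, the same idea would amount to lower-casing both post.getTitle().getPlainText() and the search string before calling contains.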

References
Blogger Client Libraries and Sample Code

Blogger Developer's Guide: .NET

Blogger Developer's Guide: Java

RavenDB Embedded with Management Studio UI

Go directly to solution with no bla bla bla…

I’ve been playing with RavenDB (a NoSQL document-oriented database) in an ASP.NET MVC 4 project for the past week. One thing I tried to do was to access RavenDB Management Studio UI so that I could see what’s actually present within the document store. This is important because one needs to check if docs are really being inserted, related docs are being deleted, etc…

Given that I’m running the embedded version of RavenDB (RavenDB-Embedded.1.0.499 package installed via NuGet in Visual Studio 2010), I was stuck trying to access the management studio, since there isn’t much documentation on this subject when it comes to the EmbeddableDocumentStore. After struggling with it for about an hour of Googling and trial and error, I decided to post a question at StackOverflow: Running RavenDB as an EmbeddableDocumentStore and accessing RavenDB Management Studio. Then I took a break to have lunch and a nap. After that I came back to try a different approach, and it really does work. Of course this is only one way to achieve what I want. It may not be the best approach, but it’s enough. Just follow these steps:

1 - Grab RavenDB latest build here:
http://builds.hibernatingrhinos.com/downloadlatest/ravendb

2 - Extract the files to C:\RavenDB-Build-499

3 - Edit the .config file in C:\RavenDB-Build-499\Server\Raven.Server.exe.config to point to your embedded database:

<appSettings>
    <add key="Raven/Port" value="8088"/>
    <add key="Raven/DataDir" value="C:\MyProject\trunk\MyProject\App_Data\Database"/>
    <add key="Raven/AnonymousAccess" value="Get"/>
</appSettings>

4 - Double-click Start.cmd in the root folder: C:\RavenDB-Build-499\Start.cmd

The server status output window should appear while it starts:

Figure 1 - RavenDB server status window

When the server finishes its starting process, the Silverlight Management UI should be automatically opened in your preferred browser.

Figure 2 - RavenDB Management Studio UI (Web UI)

Now I can see my docs, indexes, etc… and I hope you can too! :D

Note to self
According to John Allers, one should be able to access the Management Studio without having to start the server manually. That’s fine and I had already tried that, but I could not get it working at first (some days ago). This led me to try everything else today, and my last resort was posting a question at StackOverflow. After trying the same procedure once more, that is, accessing the management studio at the URL http://localhost:8080, I finally got it working! Go figure. One possibility is that another service was running on port 8080 when I first attempted to access the UI. As Windows has restarted since then, that service (probably Hudson) is no longer running, and now everything just works as expected.
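
A port conflict like the one suspected above can be checked before starting the app. The following is only a sketch, written in Java to match the Java listing earlier on this blog, and the PortCheck class and isPortFree method are hypothetical names. It tries to bind a server socket to the port; if the bind fails, something else is most likely already listening there (a failure can also mean missing permissions, which this sketch does not distinguish).

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical helper for illustration; not related to RavenDB itself.
final class PortCheck {

    // Returns true when a server socket can be bound to the port.
    // Any bind failure (port taken, no permission) is reported as false.
    static boolean isPortFree(int port) {
        try (ServerSocket socket = new ServerSocket(port)) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        int port = 8080; // the port used to reach the Management Studio above
        System.out.println("Port " + port + (isPortFree(port) ? " is free" : " is in use"));
    }
}
```

The socket is closed again right after the check, so running this does not block the port for the web app.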

Things to do:

1 - Instantiate your EmbeddableDocumentStore this way:

_documentStore = new EmbeddableDocumentStore
{
    ConnectionStringName = "YourDbName",
    UseEmbeddedHttpServer = true
};

2 - Copy Raven.Studio.xap present in C:\RavenDB-Build-499\Server\ folder to the root folder of your web project

3 - Run your web app

4 - Access http://localhost:8080 and voila… everything SHOULD work out of the box.

5 - Select Default Database:

Figure 3 - RavenDB Management Studio accessed without running the server manually

Resources
Embedding RavenDB into an ASP.NET MVC 3 Application

Tree Graph Ordered Traversal Level by Level in C#

Recently, as part of a job interview process, I was asked to solve some programming problems. This post shows the solution to one of those problems.

Problem
The problem (or could we call it an algorithm exercise?) is this:

Consider a tree of integers. Knowing that its root node is 0, and given its adjacency list as a two dimensional array of integers, write a function that prints out the elements/nodes in order/level by level starting from the root. That is, the root is printed in the first line, elements that can be reached from the root by a path of distance 1 in the second line, elements reached by a path of distance 2 in the third line, and so forth. For example, given the following adjacency list (draw the tree for a better view):

0 => 1, 2, 3
1 => 0, 4
2 => 0
3 => 0, 5
4 => 1, 6
5 => 3
6 => 4

The program should print:

0
1 2 3
4 5
6

Little bit of theory
If you read about Tree in Graph theory, you’ll see that we can represent a tree using a graph because a tree is an undirected graph in which any two vertices are connected by exactly one simple path. In other words, any connected graph without cycles is a tree.

The tree in this problem isn’t a binary tree; it’s an n-ary tree.

Solution
With theory in mind, here goes my proposed solution…

I’m reusing some code from past posts, in particular the Graph, AdjacencyList, Node, NodeList and EdgeToNeighbor classes.

I use this method to fill a Graph with the Tree structure:

/// <summary>
/// Fills a graph with a given tree structure.
/// </summary>
/// <param name="graph"></param>
private static void FillGraphWithTreeStructure(Graph graph)
{
    // Vertices
    graph.AddNode("0", null);
    graph.AddNode("1", null);
    graph.AddNode("2", null);
    graph.AddNode("3", null);
    graph.AddNode("4", null);
    graph.AddNode("5", null);
    graph.AddNode("6", null);

    // Edges
    graph.AddDirectedEdge("0", "1");
    graph.AddDirectedEdge("0", "2");
    graph.AddDirectedEdge("0", "3");

    graph.AddDirectedEdge("1", "4");

    graph.AddDirectedEdge("4", "6");

    graph.AddDirectedEdge("3", "5");

    /* This is the tree:
               
            0
          / | \
         1  2  3
        /       \
       4         5
      /
     6
             
        This is the expected output:
             
        Level 1 = 0
        Level 2 = 1 2 3
        Level 3 = 4 5
        Level 4 = 6

    */
}

This is the method that does the hard work:

/// <summary>
/// Performs an ordered level-by-level traversal in a n-ary tree from top-to-bottom and left-to-right.
/// Each tree level is written in a new line.
/// </summary> 
/// <param name="root">Tree's root node</param>
public static void LevelByLevelTraversal(Node root)
{
    // At any given time each queue will only have nodes that
    // belong to a level
    Queue<Node> queue1 = new Queue<Node>();
    Queue<Node> queue2 = new Queue<Node>();

    queue1.Enqueue(root);

    while (queue1.Count != 0 || queue2.Count != 0)
    {
        while (queue1.Count != 0)
        {
            Node u = queue1.Dequeue();

            // Nodes of the same level are separated by a space
            Console.Write(u.Key + " ");

            // Expanding u's neighbors in the queue
            foreach (EdgeToNeighbor edge in u.Neighbors)
            {
                queue2.Enqueue(edge.Neighbor);
            }
        }

        Console.WriteLine();

        while (queue2.Count != 0)
        {
            Node v = queue2.Dequeue();

            // Nodes of the same level are separated by a space
            Console.Write(v.Key + " ");

            // Expanding v's neighbors in the queue
            foreach (EdgeToNeighbor edge in v.Neighbors)
            {
                queue1.Enqueue(edge.Neighbor);
            }
        }

        Console.WriteLine();
    }
}

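To spice things up I have implemented a second version of the above method using a ConcurrentQueue. It is named after the parallel scenario it prepares for, but as written it still processes the nodes on a single thread; the ConcurrentQueue only makes the queues safe to share between threads: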

/// <summary>
/// Performs an ordered level-by-level traversal in a n-ary tree from top-to-bottom and left-to-right in Parallel using a ConcurrentQueue.
/// Each tree level is written in a new line.
/// </summary> 
/// <param name="root">Tree's root node</param>
public static void LevelByLevelTraversalInParallel(Node root)
{
    // At any given time each queue will only have nodes that
    // belong to a level
    ConcurrentQueue<Node> queue1 = new ConcurrentQueue<Node>();
    ConcurrentQueue<Node> queue2 = new ConcurrentQueue<Node>();

    queue1.Enqueue(root);

    while (queue1.Count != 0 || queue2.Count != 0)
    {
        while (queue1.Count != 0)
        {
            Node u;
                    
            queue1.TryDequeue(out u);

            // Nodes of the same level are separated by a space
            Console.Write(u.Key + " ");

            // Expanding u's neighbors in the queue
            foreach (EdgeToNeighbor edge in u.Neighbors)
            {
                queue2.Enqueue(edge.Neighbor);
            }
        }

        Console.WriteLine();

        while (queue2.Count != 0)
        {
            Node v;
                    
            queue2.TryDequeue(out v);

            // Nodes of the same level are separated by a space
            Console.Write(v.Key + " ");

            // Expanding v's neighbors in the queue
            foreach (EdgeToNeighbor edge in v.Neighbors)
            {
                queue1.Enqueue(edge.Neighbor);
            }
        }

        Console.WriteLine();
    }
}

Now it’s time to measure the execution time using a Stopwatch:

private static void Main(string[] args)
{
    Graph graph = new Graph();

    FillGraphWithTreeStructure(graph);

    Stopwatch stopWatch = new Stopwatch();

    stopWatch.Start();

    LevelByLevelTraversal(graph.Nodes["0"]);

    stopWatch.Stop();

    // Write time elapsed
    Console.WriteLine("Time elapsed: {0}", stopWatch.Elapsed);

    //Resetting the watch...
    stopWatch.Reset();

    stopWatch.Start();

    LevelByLevelTraversalInParallel(graph.Nodes["0"]);

    stopWatch.Stop();

    // Write time elapsed
    Console.WriteLine("Time elapsed: {0}", stopWatch.Elapsed);

    Console.ReadKey();
}

Now the results:

Sequential
0
1 2 3
4 5
6
Time elapsed: 00:00:00.0040340

Parallel
0
1 2 3
4 5
6
Time elapsed: 00:00:00.0020186

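As you see, the second run takes about half the time of the first. Keep in mind that both methods run on a single thread, so the difference most likely comes from one-off warm-up costs paid by the first call (JIT compilation and the first console writes) rather than from parallelism. I currently have a Core 2 Duo processor in my Mac mini.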
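
When two methods are timed back to back in the same process, whichever runs first tends to pay the start-up costs. A common way to reduce that effect is to run the code a few times untimed before measuring it. Here is a rough sketch of the idea in Java (the WarmUpTiming class, its method names and the summing workload are hypothetical, and the numbers it prints will vary from machine to machine):

```java
// Hypothetical timing helper for illustration only.
final class WarmUpTiming {

    // Sums the integers below n; a stand-in for the work being measured.
    static long work(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    // Runs the task warmUpRuns times untimed, then returns the
    // duration of one more run in nanoseconds.
    static long timeAfterWarmUp(Runnable task, int warmUpRuns) {
        for (int i = 0; i < warmUpRuns; i++) {
            task.run();
        }
        long start = System.nanoTime();
        task.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        Runnable task = () -> work(1_000_000);
        long cold = timeAfterWarmUp(task, 0);  // no warm-up at all
        long warm = timeAfterWarmUp(task, 20); // timed after 20 untimed runs
        System.out.println("First run:     " + cold + " ns");
        System.out.println("After warm-up: " + warm + " ns");
    }
}
```

The same approach should carry over to the C# code above: call each traversal method once before starting the Stopwatch, then compare the timed runs.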

Hope you enjoy it, and feel free to add your 2 cents to improve this code! Of course there are other ways of solving this very problem and I would like to see them. Do you have a better idea?
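
One such alternative, sketched in Java, reads the adjacency list directly as a two-dimensional array, as the problem statement describes, and uses a single queue. The number of nodes in the queue at the start of each pass is the size of the current level, and a visited array keeps the traversal from walking back to a parent, since the list is undirected. The LevelOrder class and its levels method are hypothetical names, and this is just one possible approach, not the solution used in the post.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical alternative for illustration only.
final class LevelOrder {

    // adjacency[i] lists the neighbors of node i (undirected, so a node's parent appears too).
    // Returns the nodes grouped by their distance from the root.
    static List<List<Integer>> levels(int[][] adjacency, int root) {
        List<List<Integer>> result = new ArrayList<>();
        boolean[] visited = new boolean[adjacency.length];
        Queue<Integer> queue = new ArrayDeque<>();
        queue.add(root);
        visited[root] = true;

        while (!queue.isEmpty()) {
            // Everything currently in the queue belongs to the same level.
            int levelSize = queue.size();
            List<Integer> level = new ArrayList<>();
            for (int i = 0; i < levelSize; i++) {
                int node = queue.remove();
                level.add(node);
                for (int neighbor : adjacency[node]) {
                    if (!visited[neighbor]) {
                        visited[neighbor] = true;
                        queue.add(neighbor);
                    }
                }
            }
            result.add(level);
        }
        return result;
    }

    public static void main(String[] args) {
        // The adjacency list from the problem statement.
        int[][] adjacency = {
            {1, 2, 3}, {0, 4}, {0}, {0, 5}, {1, 6}, {3}, {4}
        };
        for (List<Integer> level : levels(adjacency, 0)) {
            StringBuilder line = new StringBuilder();
            for (int node : level) {
                if (line.length() > 0) {
                    line.append(' ');
                }
                line.append(node);
            }
            System.out.println(line);
        }
    }
}
```

For the adjacency list given in the problem, this prints the four lines 0, 1 2 3, 4 5 and 6.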

Download
You can get the Microsoft Visual Studio Console Application Project at:

https://sites.google.com/site/leniel/blog/TreeLevelTraversal.rar

To try out the code you can use the free Microsoft Visual C# 2010 Express Edition that you can get at: http://www.microsoft.com/visualstudio/en-us/products/2010-editions/visual-csharp-express