Parallel LINQ (PLINQ) with Visual Studio 2010

On the last day of May I wrote about how to calculate prime numbers with LINQ in C#. To close that post I said that I’d use the PrimeNumbers delegate to evaluate PLINQ (Parallel LINQ) and measure the performance gains when the same calculation is done in parallel instead of in a sequential fashion.

As promised, today I show the performance gains when the same delegate is run in 2 cores (parallel) instead of only 1 core (sequential).

Here’s the delegate code:

Func<int, IEnumerable<int>> PrimeNumbers = max =>
from i in Enumerable.Range(2, max - 1)
where Enumerable.Range(2, i - 2).All(j => i % j != 0)
select i;  

To make it a candidate to parallelization we must just call the AsParallel() extension method on the data to enable parallelization for the query:

Func<int, IEnumerable<int>> PrimeNumbers = max =>
from i in Enumerable.Range(2, max - 1).AsParallel()
where Enumerable.Range(2, i - 2).All(j => i % j != 0)
select i;  

I set up a simple test to measure the time elapsed when using the two possible ways of calling the delegate function, that is, sequentially in one core and parallelized in my two available cores (I have an Intel Pentium Dual Core E2180 @ 2.00 GHz / 2.00 GHz).

Let’s calculate the prime numbers that are less than 50000 sequentially and in parallel:

IEnumerable<int> result = PrimeNumbers(50000);
Stopwatch  stopWatch = new Stopwatch();

stopWatch.Start();

foreach(int i in result)
{
    Console.WriteLine(i);
}

stopWatch.Stop();

// Write time elapsed
Console.WriteLine("Time elapsed: {0}", stopWatch.Elapsed);

Now the results:

1 core
Time elapsed: 00:00:06.0252929

2 cores
Time elapsed: 00:00:03.2988351

When running in parallel the result was great - almost half the time it took to run the app in a sequential fashion.

The whole work gets divided into two worker threads/tasks as shown in Figure 1:

Prime Numbers PLINQ Parallel Stacks Window
Figure 1 - The Parallel Stacks window in Visual Studio 2010

You can see that each thread is responsible for a range of values (data is partitioned among available cores). Thread 1 is evaluating the value 32983 and Thread 3 is evaluating 33073. This all occurs synchronously.

If I had a computer with 4 cores, the work would be divided into 4 threads/tasks and so on. If the time kept decreasing I’d achieve 1.5 seconds to run the app. Fantastic, isn’t it?

The new Microsoft Visual Studio 2010 (currently in Beta 2) comes with great debugging tooling for parallel applications as for example the Parallel Stacks shown in Figure 1 and the Parallel Tasks window shown in Figure 2:

Prime Numbers PLINQ Parallel Tasks Window
Figure 2 - The Parallel Tasks window in Visual Studio 2010

This post gives you a rapid view of PLINQ and how it can leverage the power of you current and future hardware.

The future as foreseen by hardware industry specialists is a multicore future. So why not get ready to it right now? You certainly can with PLINQ. It abstracts all the low level code to get parallel and let’s you focus on what’s important: your business domain.

References
Features new to parallel debugging in VS 2010
Debugging Task-Based Parallel Applications in Visual Studio 2010 by Daniel Moth and Stephen Toub

Great lecture on what to expect from the multicore and parallel future…
Slides from Parallelism Tour by Stephen Toub

PLINQ documentation on MSDN
http://msdn.microsoft.com/en-us/library/dd460688%28VS.100%29.aspx

Parallel Computing Center on MSDN
http://msdn.microsoft.com/en-us/concurrency/default.aspx

Daniel Moth’s blog
http://www.danielmoth.com/Blog/index.htm

Microsoft Visual Studio 2010
http://www.microsoft.com/visualstudio/en-us/products/2010/default.mspx

Finding missing numbers in a list using LINQ with C#

Let’s say you have a list of integer values that represent the days of a month like this:

6, 2, 4, 1, 9, 7, 3, 10, 15, 19, 11, 18, 13, 22, 24, 20, 27, 31, 25, 28

Clearly we have missing numbers/days in the list above. They are:

5 8 12 14 16 17 21 23 26 29 30

It’s really easy to get a list of missing numbers using LINQ with C# and the Except operator. LINQ is the greatest addition to the C# language. I can imagine how life would be difficult if we hadn’t LINQ!

This is how I implemented a missing numbers finder using a C# extension method:

public static class MyExtensions
{
    /// <summary>
    /// Finds the missing numbers in a list.
    /// </summary>
    /// <param name="list">List of numbers</param>
    /// <returns>Missing numbers</returns>
    public static IEnumerable<int> FindMissing(this List<int> list)
    {
        // Sorting the list
        list.Sort();

        // First number of the list
        var firstNumber = list.First();
// Last number of the list var lastNumber = list.Last(); // Range that contains all numbers in the interval
// [ firstNumber, lastNumber ]
var range = Enumerable.Range(firstNumber, lastNumber - firstNumber); // Getting the set difference var missingNumbers = range.Except(list); return missingNumbers; } }

Now you can call the extension method in the following way:

class Program
{
    static void Main(string[] args)
    {
        // List of numbers
        List<int> daysOfMonth =
            new List<int>() { 6, 2, 4, 1, 9, 7, 3, 10, 15, 19, 11, 18, 13, 22, 24, 20, 27, 31, 25, 28 };

        Console.Write("\nList of days: ");

        foreach(var num in daysOfMonth)
        {
            Console.Write("{0} ", num);
        }

        Console.Write("\n\nMissing days are: ");

        // Calling the Extension Method in the List of type int 
foreach(var number in daysOfMonth.FindMissing()) { Console.Write("{0} ", number); } } }

This is the output:

 Missing Numbers Finder output

In this simple program I’m using 3 concepts of the C# language that are really interesting: implicitly typed local variables, extension methods and collection initializers.

Hope this simple extension method to find the missing elements of a sequence helps the developers out there.

Visual Studio 2008 C# Console Application
You can get the Microsoft Visual Studio Project at:

http://leniel.googlepages.com/MissingNumbersFinder.zip

To try out the code you can use the free Microsoft Visual C# 2008 Express Edition that you can get at: http://www.microsoft.com/express/vcsharp/

NPOI with Excel Table and dynamic Chart

A reader of the blog called Zip wrote a comment on the post Creating Excel spreadsheets .XLS and .XLSX in C#.

This is an excerpt from Zip’s comment:

if I add rows using NPOI in C#, rows added under the table won't be automatically included in the table, and my chart is not updated the way I would like it to be.
How can I work around this problem?

I tried to simulate the problem with a simple spreadsheet and I was getting the same problem stated by Zip, that is, if I added one row just beneath the last row in the table, such added row wasn’t included in Excel’s data table and consequently the chart bound to the table wasn’t updated to reflect the new data.

To workaround this problem, let’s consider the following spreadsheet shown in Figure 1:

NPOI with Excel Table and dynamic Chart 
Figure 1 - NPOI with Excel Table and dynamic Chart

As you see we have a simple Excel data table with a players column that represents the name arguments of the chart, 4 columns for the months that form the category labels arguments (X axis) and the values arguments for the months going from Jan through Apr (Y axis).

Using NPOI to insert a new row in the table shown above we do the following:

// Creating a new row... 0 is the first row for NPOI.
HSSFRow row = sheet.CreateRow(5); // Row 6 in Excel
// Creating new cells in the row... 0 is the first column for NPOI.
row.CreateCell(1).SetCellValue("Eve Paradise"); // Column B
row.CreateCell(2).SetCellValue(4); // Column C
row.CreateCell(3).SetCellValue(3); // Column D
row.CreateCell(4).SetCellValue(2); // Column E
row.CreateCell(5).SetCellValue(1); // Column F 

The result is shown in Figure 2:

NPOI with Excel Table and dynamic Chart - Adding a new row
Figure 2 - NPOI with Excel Table and dynamic Chart - Adding a new row

Figure 2 shows us the problem stated by Zip in his comment. The new row we just added wasn’t included in the table. The chart that is linked to the table won’t update because it isn’t aware of the new row.

How to workaround this problem? That’s the question!

After playing with this case for 4 hours I’ve found a way of doing what Zip asks for.

Here’s how I did it:

Expand your Excel data table to row 10. I expanded only 4 rows just to show you how to workaround NPOI’s current limitation.

To expand your table, click in the minuscule handle in the lower-right corner of the cell occupying the lower-right corner of the table. This handle gives you a way to expand the table. Usually, it’s easier just to add data and let Excel expand the table - what doesn’t work with NPOI. But if you want to add several new rows or columns all at once, the handle is a good way to do it.

After expanding your table save the spreadsheet. It’ll be the template spreadsheet used to create new spreadsheets.

Figure 3 shows how the above spreadsheet looks like when the table is expanded to row 10:

NPOI with Excel Table and dynamic Chart - Expanding the Table
Figure 3 - NPOI with Excel Table and dynamic Chart - Expanding the Table

We can see that row 6 added using NPOI is now part of the table because we expanded the table. The chart now shows the new data but we got a new problem: the chart shows empty (blank series) that are the reflection of the the empty rows we have on the data table - take a look at the chart’s legend for example and you’ll see squares that represent nothing.

How to get over this? Well, we just need to filter the data in the table as shown in

Figure 4:

NPOI with Excel Table and dynamic Chart - Filtering Data (blank series)
Figure 4 - NPOI with Excel Table and dynamic Chart - Filtering Data (blank series)

Filter out players removing the blank rows by unchecking (Blanks) circled in red in Figure 4. Doing so the chart will reflect the change showing only the filtered data as you see in Figure 5:

NPOI with Excel Table and dynamic Chart - Filtered Data (no empty rows)
Figure 5 - NPOI with Excel Table and dynamic Chart - Filtered Data (no empty rows)

Now we have an Excel data table that is filtered (take a look at the funnel symbol) in the Player column. Other difference is that the rows that contain data are marked in blue. Although we have only 4 rows of data being displayed, our table has indeed 8 rows of data because we expanded it. The other 4 rows are hidden because they were filtered for not having any data yet.

Positioning the mouse cursor within the Excel data table, I’ll add a Total Row (option circled in red) in the table so that I can summarize data the way I want for each column as shown in Figure 6:

NPOI with Excel Table and dynamic Chart - Adding Total Row
Figure 6 - NPOI with Excel Table and dynamic Chart - Adding Total Row

With this Excel template spreadsheet we can now use NPOI to fill our sheet with more 4 rows of data. Let’s do it. This is the code I used:

HSSFRow row7 = sheet.CreateRow(6);

row7.CreateCell(1).SetCellValue("David Goliath");
row7.CreateCell(2).SetCellValue(7);
row7.CreateCell(3).SetCellValue(7);
row7.CreateCell(4).SetCellValue(7);
row7.CreateCell(5).SetCellValue(7);

HSSFRow row8 = sheet.CreateRow(7);

row8.CreateCell(2).SetCellValue("Moses of Egypt");
row8.CreateCell(3).SetCellValue(8);
row8.CreateCell(4).SetCellValue(8);
row8.CreateCell(5).SetCellValue(8);
row8.CreateCell(6).SetCellValue(8);

HSSFRow row9 = sheet.CreateRow(8);

row9.CreateCell(1).SetCellValue("David Shepherd");
row9.CreateCell(2).SetCellValue(9);
row9.CreateCell(3).SetCellValue(9);
row9.CreateCell(4).SetCellValue(9);
row9.CreateCell(5).SetCellValue(9);

HSSFRow row10 = sheet.CreateRow(9);

row10.CreateCell(2).SetCellValue("Jesus of Nazareth");
row10.CreateCell(3).SetCellValue(10);
row10.CreateCell(4).SetCellValue(10);
row10.CreateCell(5).SetCellValue(10);
row10.CreateCell(6).SetCellValue(10);
// Forcing formula recalculation so that the Total Row gets updated
sheet.ForceFormulaRecalculation = true;

After filling the spreadsheet we get the result shown in Figure 7:

NPOI with Excel Table and dynamic Chart - Chart updated automatically/dynamically
Figure 7 - NPOI with Excel Table and dynamic Chart - Chart updated automatically/dynamically

This is the workaround! :o)

The rows added with NPOI now are part of the table and are shown in the chart.

As a last hint: remember to expand your Excel data table to the number of rows you think your spreadsheet will store so that the rows added with NPOI get included in the table and the chart gets updated.

Again this is a good proof of what free software as is the case of NPOI can make for us. Even when dealing with more elaborated concepts as is the case of Excel tables and charts NPOI makes it easy to get the job done.

I wish that the next version of NPOI does what Zip wants automatically, that is, recognize rows added under the last row of an Excel table. At least we could have a parameter to let the user define if s/he wants the row to make part of the table or not.

Hope you enjoy this post.

Visual Studio 2008 C# ASP.NET MVC Web Application
You can get the Microsoft Visual Studio Project at:

http://leniel.googlepages.com/NPOIExcelTableChartMvcProject.zip

To try out the code you can use the free Microsoft Visual Web Developer 2008 Express Edition that you can get at: http://www.microsoft.com/express/vwd/Default.aspx