Monday, January 28, 2013

Follow-up for Connection Pooling...

I had mentioned in one of my recent posts that I was told we are not using connection pooling in our C# web service code. I had done some reading, and it seemed pretty clear that connection pooling is enabled by default in the .Net framework, and that you would need to explicitly disable pooling.

I looked through our code and I couldn't find any instance where we set pooling to false for our connection strings. I also wrote some simple code to test if the classes we are using (from Microsoft.Practices.EnterpriseLibrary.Data) might do something odd that would prevent connection pooling, or require us to explicitly state that we want to use connection pooling.

The result of my test code was that the loop where I explicitly set "pooling=false" took much longer to execute than the loop without it. That shouldn't be surprising, but I was told by multiple people that the code was not using connection pooling. It is understandable why people thought that, though. For one, there is sometimes a feeling that as code gets old, the quality of the design degrades. Also, there are lots of coding styles and myths that get passed around as fact. This has been a concrete example for me that, when it comes to writing software, everyone should try things out before just accepting something as fact. In addition, it pays to read the documentation!

The test code looks like this:

using System;
using System.Data.Common;
using System.Data.SqlClient;
using System.Diagnostics;
using Microsoft.Practices.EnterpriseLibrary.Data;

public static class ConnectionPoolingTest
{
    // Pooling is on by default, so the first connection string uses it implicitly;
    // the second connection string explicitly turns it off.
    private const string pooled =
        @"Data Source=someserver;Initial Catalog=dbname;Integrated Security=SSPI;";
    private const string notpooled =
        @"Data Source=someserver;Initial Catalog=dbname;Integrated Security=SSPI;Pooling=false;";

    public static void Main(string[] args)
    {
        int testCount = 10000;

        GenericDatabase db = new GenericDatabase(pooled, SqlClientFactory.Instance);
        Console.WriteLine("    pooled : {0}", RunConnectionTest(db, testCount));

        db = new GenericDatabase(notpooled, SqlClientFactory.Instance);
        Console.WriteLine("not pooled : {0}", RunConnectionTest(db, testCount));
    }

    private static TimeSpan RunConnectionTest(GenericDatabase db, int testCount)
    {
        Stopwatch sw = new Stopwatch();
        sw.Start();
        DbConnection conn;
        for (int i = 0; i < testCount; i++)
        {
            conn = db.CreateConnection();
            conn.Open();
            var schema = conn.GetSchema(); // just to make the code do something
            conn.Close();
        }
        sw.Stop();
        return sw.Elapsed;
    }
}


I specifically did not wrap the connection in a using block because I wanted to mimic what the service code is doing.

The result of the test was that the loop using the connection string that implicitly uses connection pooling finished in just over 1 second. The loop for the connection string that explicitly says to not use connection pooling took about 30 seconds.

I'm pretty happy to know that we don't need to do any updates to our code to make it use connection pooling!

Saturday, January 26, 2013

Free networking class via Coursera.org...

I signed up for a class on coursera.org.  Well, that's not true - I signed up for 17 classes on coursera.org.  The classes are all free, and they are taught by professors from top-notch universities.  I just started the "Introduction to Computer Networks" class.  It's already been incredibly interesting, and I'm very happy I signed up for it.

The class has already started, but if you don't really care about the grade you get out of the class, and you have an interest in learning how computer networks work, then don't hesitate to sign up. 

The lecture videos have been very straightforward so far, and they are broken into easy-to-digest (short) segments. This lends itself to being easy to watch during a lunch break, or if you have 30 minutes or less available in the evenings. Most videos have been around 15 minutes or less, but I have found it useful to re-watch segments after realizing that I might have missed something important.

We covered how latency is computed, and it made me realize that, even though latency is not difficult to understand, there is more to it than I thought. I have always generalized latency to mean lag - for example, playing some online game might be choppy for me due to network latency. That might be true, but now I understand what the latency actually is.

The way latency was explained is that it is the transmission delay (the amount of time it takes to put the bits of data onto the "wire") plus the propagation delay (the amount of time for the bits to travel the distance from the source host to the target host).

Let L = latency, M = message length in bits, R = transmission rate (bits per second), C = the speed of light, and Length = the distance from the source host to the target host. The bits propagate through the "wire" at roughly 2/3 the speed of light, so:

Transmission delay : M / R
Propagation delay  : D = Length / ((2/3) * C)

Therefore:

Latency : L = M/R + D

When you break it down that way, it seems pretty clear what might be causing high latency if you check 1) where the target host you are connected to is located, 2) what the bandwidth of your network connection is, and 3) how much data you are transferring.
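To make the formula concrete, here is a little C# sketch that just plugs in made-up numbers (a 1 MB message, a 10 Mbps link, and a target host 3,000 km away - none of these values came from the class, they are only for illustration):

using System;

public static class LatencySketch
{
    public static void Main()
    {
        // Made-up numbers, only to plug into L = M/R + D
        double m = 8000000;          // M : message length in bits (1 MB)
        double r = 10000000;         // R : transmission rate in bits per second (10 Mbps)
        double length = 3000000;     // distance to the target host in meters (3,000 km)
        double c = 300000000;        // speed of light in meters per second (rounded)

        double transmissionDelay = m / r;                       // M / R
        double propagationDelay = length / ((2.0 / 3.0) * c);   // D = Length / ((2/3) * C)
        double latency = transmissionDelay + propagationDelay;

        Console.WriteLine("Transmission delay : {0:F3} s", transmissionDelay); // ~0.800
        Console.WriteLine("Propagation delay  : {0:F3} s", propagationDelay);  // ~0.015
        Console.WriteLine("Latency            : {0:F3} s", latency);           // ~0.815
    }
}

With numbers like those, the transmission delay dominates, which lines up with points 2 and 3 above.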

Something else we covered was the Nyquist limit, which tells us how much data can be sent over a channel of a given bandwidth.  It was a fairly brief description, and the main thing I got out of it was how to use the formula.  :)

The formula is :

Rate = 2 * Bandwidth * log2(V) bits per second

V is the number of signal levels that are being used.

Here is a sample question from the homework:

TV channels are 6 MHz wide. How much data can be sent per second, if four-level digital signals are used? Assume a noiseless channel, and use the Nyquist limit.
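Working that one through with the formula above (my own arithmetic, so double-check it against the lecture): four signal levels means log2(4) = 2, so Rate = 2 * 6,000,000 * 2 = 24,000,000 bits per second, or 24 Mbps.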

Fun stuff! Apparently the class will eventually include some Java programming assignments, so I'm looking forward to those.



Wednesday, January 23, 2013

Connection pooling and using TCP instead of Named Pipes...

We are experiencing a very odd problem at work. The SQL Servers need to be failed over to their backup instances on a regular basis as a work-around for an issue where SQL Server appears to stop accepting connections, or connections take a long time to establish. Microsoft support has been working with my employer since around November, and they don't seem to be getting any closer to diagnosing and fixing the problem.

One of the potential symptoms of the issue is that we have intermittent long-running web method calls.  The web methods are fairly simple - we make a sproc call, and then populate and return objects with the data.  The SQL DBAs have checked the performance of the sproc calls during the times that the web methods are performing slowly, and they have assured us that the sproc calls are returning in 200ms or less.

The CPU load on the load-balanced web servers seems to be reasonable, so we have been speculating that the slow web method calls are either tied to the mysterious SQL Server connection issue or caused by some unknown issue while trying to connect to the database.

We decided to take a number of steps to try to improve performance.  If the issue is due to the MSC (mysterious SQL connection) issue, then our changes won't make much of a difference.  However, we might find that our code is either exacerbating the MSC issue or is the direct cause of the slow web method calls.  It's doubtful that our code is the sole cause of the MSC issue, since multiple unrelated SQL Servers have experienced it.

Step 1: Add extra performance counters to try to pinpoint where the slow performance issue is occurring

We added performance counters around some of our database connection code, and around the sproc call, to see if there are any performance issues.  The tool we use to view performance counters was missing the new counters we added - the ones that would help show how often new connections are being made.  That should hopefully be fixed soon.
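To give a rough idea of the kind of timing we are adding (the server, database, and sproc names below are placeholders, and the real code publishes the values through performance counters instead of writing to the console), a sketch looks something like this:

using System;
using System.Data;
using System.Data.SqlClient;
using System.Diagnostics;

public static class SprocTimingSketch
{
    public static void TimeSprocCall()
    {
        // Placeholder connection string and sproc name
        string connectionString =
            @"Data Source=someserver;Initial Catalog=dbname;Integrated Security=SSPI;";

        Stopwatch connectTimer = Stopwatch.StartNew();
        using (SqlConnection conn = new SqlConnection(connectionString))
        {
            conn.Open();
            connectTimer.Stop();

            Stopwatch sprocTimer = Stopwatch.StartNew();
            using (SqlCommand cmd = new SqlCommand("dbo.SomeSproc", conn))
            {
                cmd.CommandType = CommandType.StoredProcedure;
                using (SqlDataReader reader = cmd.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        // populate and return objects here
                    }
                }
            }
            sprocTimer.Stop();

            // The real code publishes these values to performance counters;
            // console output is just for the sketch.
            Console.WriteLine("Connection open : {0} ms", connectTimer.ElapsedMilliseconds);
            Console.WriteLine("Sproc execution : {0} ms", sprocTimer.ElapsedMilliseconds);
        }
    }
}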

Step 2: Change the connection from using named pipes to TCP

Is it better to use TCP than named pipes?  According to MSDN, the performance of named pipes and TCP is virtually identical on fast networks.  However, if you are on a slow network, and the SQL Server database is running on a different machine than the application connecting to it, then TCP can give you better performance.  The documentation made it sound as though named pipes might give better performance if the SQL Server instance were on the same machine as the app accessing the database.  That is not our configuration, so there appears to be no benefit to using named pipes.  TCP, on the other hand, is less "talky" (it needs less back-and-forth communication to perform reads) than named pipes, and it can also help streamline communication by using backlog queues.
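For reference, the protocol can be forced directly in the connection string. A minimal sketch, with placeholder server and database names (either form should work with System.Data.SqlClient):

// Force TCP instead of named pipes directly in the connection string
// (server and database names are placeholders).
private const string tcpViaNetworkLibrary =
    @"Data Source=someserver;Initial Catalog=dbname;Integrated Security=SSPI;Network Library=DBMSSOCN;";

// ...or prefix the server name with tcp: (optionally adding the port)
private const string tcpViaPrefix =
    @"Data Source=tcp:someserver,1433;Initial Catalog=dbname;Integrated Security=SSPI;";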

I hadn't realized that there might be a potential performance benefit of using TCP.  It shows me that I really should read through the documentation for technologies that I use regularly.  It will help me be a lot more thoughtful about how and why I am doing something.  Does using the default protocol (named pipes) work?  Yes. However, we also need to be mindful of the performance our services have when fetching data from our databases.

Step 3: Change the connections to be in a form that will allow connection pooling (if the app isn't already using connection pooling)

I'm pretty new to C#/.Net, so some of what I've been told by my teammates has been taken on faith.  However, I want to learn, so I did some reading.  It appears that connection pooling should happen regardless of whether we use SqlConnection or DbConnection.  I'll need to look closer at the code and config files to find out whether our app is disabling pooling, but it might be that we are using pooling and the following is occurring because of the MSC issue. From an MSDN article on pooling:
When connection pooling is enabled, and if a timeout error or other login error occurs, an exception will be thrown and subsequent connection attempts will fail for the next five seconds, the "blocking period". If the application attempts to connect within the blocking period, the first exception will be thrown again. After the blocking period ends, another connection failure by the application will result in a blocking period that is twice as long as the previous blocking period. Subsequent failures will result in new blocking periods that are twice as long as the previous blocking period, up to a maximum of five minutes.
So, if the mysterious SQL connection issue is causing connections to fail, then we should expect exponentially longer waits (up to 5 minutes) for each failure to connect. That would explain why our web method calls usually return in hardly any time, and then take 30+ seconds at other times.

I just need to prove that we are using connection pooling, or find out why we are not.  I hope that we are, because the connection failure blocking period seems to be a great explanation for what we are seeing.
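One quick way to check will be to run the connection strings from our config files through SqlConnectionStringBuilder, which exposes the pooling-related keywords as properties. A minimal sketch (the connection string here is just a stand-in for ours):

using System;
using System.Data.SqlClient;

public static class PoolingCheck
{
    public static void Main()
    {
        // Stand-in for a connection string pulled from one of our config files
        var builder = new SqlConnectionStringBuilder(
            @"Data Source=someserver;Initial Catalog=dbname;Integrated Security=SSPI;");

        // Pooling defaults to true unless the string explicitly contains Pooling=false
        Console.WriteLine("Pooling       : {0}", builder.Pooling);
        Console.WriteLine("Min Pool Size : {0}", builder.MinPoolSize);
        Console.WriteLine("Max Pool Size : {0}", builder.MaxPoolSize);
    }
}

If Pooling comes back true for all of our connection strings, then the blocking period theory stays on the table.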

Leave a comment, or send an email, if you have any troubleshooting suggestions.

Tuesday, January 22, 2013

QuickCounters goof...

I can't believe how blind I was!  I added a quick counter for a WCF service because I wanted to track how long a certain process takes.  I added code similar to this:


// TODO: LEE DEBUG QuickCounters
var qcRequest = RequestType.Attach(QCCategoryOut, "SomeProcessName", true);
try
{
   SomeCode.Instance.DoStuff(asset);
   qcRequest.SetComplete();
}
catch (Exception ex)
{
   qcRequest.SetAbort();
   throw ex;
}

What happened is that the counter ended up showing a very large value for SomeProcessName's "Request Execution Time (msec)" counter.  I couldn't figure out why something that was finishing in what looked like a second was displaying values like 1298793265 for the Maximum value.

I spent a good bit of time searching for posts on QuickCounters having invalid values for the execution time, but couldn't find anything.  Finally, I noticed the problem - I forgot the BeginRequest() call!  I don't know if it is a feature of QuickCounters to not throw an exception if you call SetComplete() without first calling BeginRequest, but I suppose it is nice to not have code blowing up in production because you added QuickCounter code incorrectly.

The following code works just fine:


// TODO: LEE DEBUG QuickCounters
var qcRequest = RequestType.Attach(QCCategoryOut, "SomeProcessName", true);
try
{
   qcRequest.BeginRequest(); // the call I had forgotten - it starts the timing for the request
   SomeCode.Instance.DoStuff(asset);
   qcRequest.SetComplete();
}
catch (Exception ex)
{
   qcRequest.SetAbort();
   throw ex;
}





Friday, January 18, 2013

Playing with Geocoding APIs...

I decided to mess around with some geocoding APIs.  I didn't really care which API I used, but I wanted it to be free and preferably a REST API.  The REST API "want" is based on the idea that the code for processing the results would be similar for all the REST based geocoding services.

I used both the Yahoo Geocoding API (http://developer.yahoo.com/maps/rest/V1/geocode.html) and the Google Geocoding API (https://developers.google.com/maps/documentation/geocoding/).  

The Yahoo Geocoding API is free and allows up to 10,000 requests per day.  The downside is that the API is deprecated, and it looks like Yahoo wants to move to a pay-per-use service.  In other words, it works right now, but who knows for how long?

The Google Geocoding API is free and allows up to 2,500 requests per day.  If you exceed that, Google says they may block your usage for a while, and if you continue to exceed it, they will block you indefinitely.  There is a response value in the Google Geocoding API that lets you know when you have exceeded your query limit.
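To give a rough idea of what a request looks like, here is a minimal sketch against the Google JSON endpoint (the address is just an example, and this is not the exact code from my app):

using System;
using System.Net;

public static class GeocodeSketch
{
    public static void Main()
    {
        // Example address only - the app reads these from the command line or an input file
        string address = "400 Broad St, Seattle, WA";
        string url = "http://maps.googleapis.com/maps/api/geocode/json?sensor=false&address="
                     + Uri.EscapeDataString(address);

        using (var client = new WebClient())
        {
            // The JSON response includes a "status" field; "OVER_QUERY_LIMIT"
            // is the value that tells you the daily limit has been exceeded.
            string json = client.DownloadString(url);
            Console.WriteLine(json);
        }
    }
}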

I wrote an app that implements both the Yahoo and Google geocoding APIs.  It will either take in a single address/place of interest, or read in addresses from an input file.  The input file is expected to have one address or place of interest per line.

For fun I decided to create a utility class that takes an array of Coordinates (a class that holds the latitude, longitude, and location) and calculates the total distance traveled when visiting each Coordinate in order.

I found the formula for calculating the distance on stackoverflow.com (providing copy/paste solutions for the masses).  

http://stackoverflow.com/questions/3715521/how-can-i-calculate-the-distance-between-two-gps-points-in-java
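One common way to compute the distance between two latitude/longitude points is the haversine formula. A rough C# sketch (not the exact code from my repo, and it uses the usual 6,371 km mean Earth radius approximation):

using System;

public static class DistanceSketch
{
    // Great-circle distance between two lat/long points using the haversine
    // formula, with a 6,371 km mean Earth radius.
    public static double DistanceKm(double lat1, double lon1, double lat2, double lon2)
    {
        const double earthRadiusKm = 6371.0;

        double dLat = ToRadians(lat2 - lat1);
        double dLon = ToRadians(lon2 - lon1);

        double a = Math.Sin(dLat / 2) * Math.Sin(dLat / 2) +
                   Math.Cos(ToRadians(lat1)) * Math.Cos(ToRadians(lat2)) *
                   Math.Sin(dLon / 2) * Math.Sin(dLon / 2);

        double c = 2 * Math.Atan2(Math.Sqrt(a), Math.Sqrt(1 - a));
        return earthRadiusKm * c;
    }

    private static double ToRadians(double degrees)
    {
        return degrees * Math.PI / 180.0;
    }
}

Summing that over consecutive pairs of Coordinates gives the total path distance.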

I was able to find that the distance between the Space Needle, CenturyLink Field, Big Ben, Taj Mahal, and the Grand Canyon is:

Total distance (km): 27492.11
Total distance (mi): 17082.80

Kind of neat!  

I posted the code on github: https://github.com/leewallen/geocodehelper

If you have any ideas for adding on to the code, then please let me know.  It would be nice to make the app useful to someone.  Also, I wouldn't mind any refactoring suggestions.

Something that might be a fun Android/Solr project would be to do the following:
  • Create a Solr index for holding information about WIFI access points (this would include the latitude and longitude for the location where the WIFI access point was discovered) 
  • Create an Android app that will log latitude/longitude whenever it finds unsecured (or secured) WIFI access points (ie, Starbucks, libraries, etc.)
  • Index the information into a Solr index
  • Create a new Android app, or add onto the first Android app, to allow you to enter an address/place of interest, and then query Solr to get a list of available WIFI access points within a specified radius
If anyone has any related project ideas, then please share!




Monday, January 14, 2013

Homeland Security says, "Disable Java in web browsers."

The Department of Homeland Security told users to disable Java in web browsers due to security flaws that allow malicious code to be installed on their computers.  That's not so surprising.  What does seem surprising is that Oracle had no response.  That seems really weird.  Why not tell people that the problems are being addressed or investigated?

http://news.yahoo.com/government-warns-java-security-concerns-escalate-160640366--sector.html

The other thing I thought was odd is that the problem described on the Department of Homeland Security website is nearly identical to an issue mentioned back in August.

Here is a link to the latest issue:

http://www.us-cert.gov/cas/techalerts/TA13-010A.html

Here is a link to the issue mentioned in August:

http://www.us-cert.gov/cas/techalerts/TA12-240A.html

Perhaps these aren't completely identical issues, and that is why the solution is different.  However, it seems as if the Department of Homeland Security is tired of waiting for Oracle to fix the problem - or perhaps they are tired of Oracle saying the problem is fixed when in reality the problem still exists.

Whatever the case is - I will be happy when there are fewer Java applets in web applications.  I hate getting prompts asking whether or not I want to load/run an applet that is referenced in a web page.

The downside is that every now and then there is an applet that seems kind of cool.  For example, there are applets that can simulate how high a model rocket will fly depending on wind speed and direction, type and number of engines, staging of engines, and materials used for the various parts of the rocket.  You can have a non-Java site calculate the altitude, top speed, etc., but I really enjoy the animation.

Update: Apparently Oracle made a fix for the flaw reported last Friday.  You just need to update to Java 7 Update 11 (hmmm...7-11).  However, there is another article that says that there are security flaws that have not been addressed:

http://uk.reuters.com/article/2013/01/14/uk-java-oracle-security-idUKBRE90C0JA20130114

So, I guess the message is - enable Java in your browsers because the security management issue has been fixed, but watch out because you're still not safe.