Wednesday, December 31, 2008

Using SGMLParser With IronPython


Mark Pilgrim's excellent Dive Into Python has a section on using SGMLParser, and having seen nothing similar for IronPython (and imagining its many uses!) I thought I'd give it a whirl. A good proof of concept, I thought, would be building a database out of link-heavy sites. Since I visit Arts & Letters Daily every so often and the closet intellectual in me likes to hang onto what I find there, I thought I'd target it:

import urllib2
import sgmllib
from sgmllib import SGMLParser

import clr
from System import *
from System.Data import *
from System.Net import *

class AlReader(SGMLParser):
    def reset(self):
        SGMLParser.reset(self)
        self.urls = []
        self.pieces = []
        self.track = 0
        self.prePend = "No Category"
        self.counter = 0

    def start_a(self, attrs):
        href = [v for k, v in attrs if k == "href"]
        key = [v for k, v in attrs if k == "name"]
        if href:
            self.track = 1
            self.urls.extend(href)
        elif key:
            self.prePend = key[0]

    def handle_data(self, text):
        if self.track:
            self.pieces.append("|".join([self.prePend, text]))
            self.counter = self.counter + 1

    def end_a(self):
        self.track = 0

    def get_links(self):
        links = []
        for i in range(0, len(self.urls)):
            links.append("|".join([self.pieces[i], self.urls[i]]))
        return links

    def get_link_datatable(self):
        d = DataTable("Links")
        d.Columns.Add(DataColumn("Category", Type.GetType("System.String")))
        d.Columns.Add(DataColumn("Site", Type.GetType("System.String")))
        d.Columns.Add(DataColumn("Url", Type.GetType("System.String")))

        for text in self.get_links():
            newRow = d.NewRow()
            newRow["Category"], newRow["Site"], newRow["Url"] = text.split("|")
            d.Rows.Add(newRow)

        return d

response = urllib2.urlopen("http://www.aldaily.com/")
a = AlReader()
a.feed(response.read())
a.close()
linkdata = a.get_link_datatable()
# write it out to prove we got it.
ds = DataSet()
ds.Tables.Add(linkdata)
ds.WriteXml("c:\\temp\\arts and letters links.xml")
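For readers on Python 3, sgmllib is gone, but the same pattern carries over to html.parser almost line for line. This is my own minimal sketch of the AlReader idea (the sample HTML and class name are invented for the demo):

```python
# Python 3 sketch of the AlReader pattern: html.parser.HTMLParser
# stands in for SGMLParser, and named anchors still act as category
# headers for the links that follow them.
from html.parser import HTMLParser

class LinkReader(HTMLParser):
    def reset(self):
        HTMLParser.reset(self)   # let the base class reset its state too
        self.links = []          # (category, text, url) tuples
        self.track = False
        self.category = "No Category"
        self.current_url = None

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        if "href" in attrs:
            self.track = True
            self.current_url = attrs["href"]
        elif "name" in attrs:
            self.category = attrs["name"]

    def handle_data(self, data):
        if self.track:
            self.links.append((self.category, data, self.current_url))

    def handle_endtag(self, tag):
        if tag == "a":
            self.track = False

reader = LinkReader()
reader.feed('<a name="essays"></a><a href="http://example.com">An Essay</a>')
print(reader.links)  # [('essays', 'An Essay', 'http://example.com')]
```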

If you find this interesting, do make sure you look at Pilgrim's chapter on HTML Processing.


Saturday, December 20, 2008

Parameterized IN Queries


I haven't listened to the podcast yet but saw a cool trick from Joel Spolsky on approaching parameterized IN queries. Purists will bemoan its lack of premature optimization but I think it's novel enough to study because of the approach: using the SQL LIKE operator on your parameter rather than a field, which is what people like me are used to. There's code on the StackOverflow post but I thought I'd paste some of the poking around I did in Sql Management Studio:

-- setup
CREATE TABLE Person (
FirstName VARCHAR(50) NULL
)

-- some data
INSERT INTO Person VALUES('Jonathan')
INSERT INTO Person VALUES('David')
INSERT INTO Person VALUES('Trilby')

-- here's the magic
DECLARE @FirstName VARCHAR(500)
SET @FirstName = '|David|Trilby|'
SELECT * FROM Person WHERE @FirstName LIKE '%|' + FirstName + '|%'

-- ported to a proc
CREATE PROC uspPersonSelector
@FirstNames VARCHAR(500)
AS
SELECT * FROM Person WHERE @FirstNames LIKE '%|' + FirstName + '|%'
GO

-- showing it works
EXEC uspPersonSelector '|David|Trilby|'

-- somewhere in the netherworld of C#:
string[] names = {"David", "Trilby"};
SqlCommand cmd = GetACommandFromSomewhere();
cmd.Parameters.AddWithValue("@FirstNames", "|" + string.Join("|", names) + "|");

DROP TABLE Person
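The same trick works anywhere LIKE is available. Here's a quick demonstration against an in-memory SQLite database (table contents and names invented for the demo):

```python
# The LIKE runs against the parameter, not the column: each row's
# FirstName is wrapped in delimiters and matched against the
# pipe-delimited parameter string.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (FirstName TEXT)")
conn.executemany("INSERT INTO Person VALUES (?)",
                 [("Jonathan",), ("David",), ("Trilby",)])

names = ["David", "Trilby"]
param = "|" + "|".join(names) + "|"   # '|David|Trilby|'

rows = conn.execute(
    "SELECT FirstName FROM Person WHERE ? LIKE '%|' || FirstName || '|%'",
    (param,)).fetchall()
print(rows)  # [('David',), ('Trilby',)]
```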



Friday, December 19, 2008

Programmers as Goalkeepers


The 8th annual New York Times Magazine Year in Ideas featured a section on Goalkeeper Science, profiling a paper by Israeli researchers called Action bias among elite soccer goalkeepers: The case of penalty kicks. Looking at keepers' behavior across 286 penalty kicks, the researchers found that although keepers dived to the right or left 94 percent of the time, the chances of stopping the kick were highest when the goalie stayed in the center. They theorized that keepers behave this way because they are afraid of appearing to do nothing.

Immediately I remembered a blurb from Paul Graham's What Business Can Learn from Open Source essay where he described a similar dynamic for programmers:

"The other problem with pretend work is that it often looks better than real work. When I'm writing or hacking I spend as much time just thinking as I do actually typing. Half the time I'm sitting drinking a cup of tea, or walking around the neighborhood. This is a critical phase-- this is where ideas come from-- and yet I'd feel guilty doing this in most offices, with everyone else looking busy."

I wonder what Paul would say about the IBM commercial on ideating, which concludes that people should "start doing" after showing an image of people lying inert on an office floor - a stark portrayal of how a manager at IBM might see someone like Paul Graham.

As programmers much of what we should do may not appear to be work for the nonprogrammer and as a result many of us end up doing it at home. I spend a lot of time at home exploring different technologies in a kind of tangential approach that wouldn't look like "working" at work but often my best ideas and solutions come from here.  I also spend a lot of time reading technical books and blogs.

I'm wondering what it would look like if we could step back and look in a quantitative way at the performance deficits resulting from the desire to look busy at work. What would the workday look like? I'm wondering what an hour for reading, a few hours for exploratory/research programming, and the rest as project time would do for my own productivity.

If goalkeeping were programming, Edwin van der Sar would be quite the Python hacker.


Thursday, December 18, 2008

Windows Forms + Web, WIB part II


A while back I made the case for applications that put together the strengths of Windows Forms and Web technologies (I thought of the catchy "WIB" as a name for this approach). The example I’d given then was a Windows Forms hosted Web Browser for local images that one could use for annotation that leveraged Windows for local file storage and a Web technology like jQuery for doing transitions in the user interface.

Today I thought of another use for this approach that wrapped itself nicely into a tool I've been using for some time to download mp3s from a given website1,2. I call the tool "Fortinbras" and if you find it useful I'd be delighted.

So how was Fortinbras changed?

Parsing the HTML for mp3 files was a little tricky. My initial approach was to use a regular expression against the text of the document which, truth be told, is a brittle approach. Part of why I never trumpeted the tool was that I never completely perfected this tactic (though it worked well enough for me personally). My code looked as follows:

	WebClient wc = new WebClient();
	string pageText = wc.DownloadString(browser.Url.ToString());
	Regex re = new Regex("href=\"(?<url>.+?mp3)\"", RegexOptions.IgnoreCase);
	Match mp3Matches = re.Match(pageText);
	while (mp3Matches.Success)
	{
	    string matchUrl = mp3Matches.Groups["url"].Value;
	    AddMp3(browser.Url.ToString(), data, matchUrl);
	    mp3Matches = mp3Matches.NextMatch();
	}

Today my epiphany was that I didn't need to use a regular expression when I could use the DOM from Windows Forms to pull out the anchors that have mp3 destinations. Here's what that looks like:

	while (browser.Document.Body == null)
	    Application.DoEvents();

	HtmlElementCollection anchors = browser.Document.Body.GetElementsByTagName("a");
	foreach (HtmlElement anch in anchors)
	{
	    string linkUrl = anch.GetAttribute("href");
	    if (linkUrl.ToLower().EndsWith("mp3"))
	        AddMp3(browser.Url.ToString(), data, linkUrl);
	}
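The same parser-over-regex switch translates directly to Python. This is my own sketch using the standard library's html.parser in place of the WebBrowser DOM (sample page invented for the demo):

```python
# Instead of a regex over the page text, let an HTML parser hand us
# the anchors and keep only the ones whose href ends in "mp3".
from html.parser import HTMLParser

class Mp3Finder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.mp3s = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href", "")
            if href.lower().endswith("mp3"):
                self.mp3s.append(href)

page = '<a href="show.MP3">ep 1</a> <a href="index.html">home</a>'
finder = Mp3Finder()
finder.feed(page)
print(finder.mp3s)  # ['show.MP3']
```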

As usual feedback is welcome - you can download a copy of the Fortinbras project here.

1I am aware of the Firefox extensions that do this but someday (imagine a pie-in-the-sky look on my face) I was hoping to incorporate a "favorites" list with URLs / locations so that this would be a one-stop shop for my downloading and organizing of podcasts. My goal here is embarrassment-driven development, so I'll probably be bummed enough about the code I've just posted to put in some enhancements as time permits.

2My friend's music blog is a great stop, try a couple of tracks at The Look Back.


Wednesday, December 17, 2008

Growing Open Source Community


Kevin Dangoor recorded an interesting screencast on some of the essentials of getting an open source project more widely circulated. Essentially Dangoor explains that having a successful open source project is not just about code, it's about good product management. I wanted to title this post with Dangoor's most quotable quote: "Rails is not where it is because of great code." But there are a lot of people who would take that as an opener for a religious war without seeing his real intent of highlighting marketing and management of a project.

I won't rehash it, it's worth watching it for yourself.


Tuesday, November 25, 2008

Bruce Eckel's Python Book


It escaped my notice but Eckel's been writing a new Python book. Should be worth digging into sometime in the future.


Sunday, November 23, 2008

How to steal time: On Programming and being a Father


It's been alluded to but here it is, neither shaken nor stirred: I became a father exactly one month ago. While it hasn't been without challenges, the joy of being a father is something I now understand firsthand rather than from being told about it or observing it in others. Even though I was first quite seduced by a certain "mystique" in programming - a nocturnal person drinking Jolt with binary reflected in mirror shades - I now realize not only that nearly every programmer and professional person I admire is a parent, but that they find their core of happiness there, well beyond the realm of work.

Nevertheless, I have been trying to keep up my personal work and interests, so with that goal in mind I've adjusted my schedule to wake up as early as I can (usually 5am) and get a few hours of solid time, leaving the evenings as a time when I can help out and be a "family man." It still doesn't feel natural, but the payoff is that I can still advance my work with IronPython and other interests. It's already been pointed out to me that my productivity here and elsewhere has dropped since my daughter was born.

So question for other programmer/dad people: how do you work the family schedule? Any tips and tricks for a one month father who is still daydreaming about what it was like to sleep all night?


Monday, November 10, 2008

Loop Reordering, Optimizing Face Recognition


Got this a few weeks ago from a friend but didn't have a chance to sit down and digest the entry until tonight. Pretty cool and although premature optimization gets a bad rap, it's a reminder that sometimes what seems intuitive and quick could actually be costing a lot -


Wednesday, November 05, 2008

Lake Wobegon Distribution


Not sure where I saw this one first but it's a great article. I wonder if the Lake Wobegon distribution applies as well to programmers...


Friday, October 31, 2008

LINQ to SQL death rumors


I think it all started with this post from Tim Mallalieu reporting some directions for LINQ to SQL and the ADO.NET Entity Framework. Ayende and others have had some strong reactions (LINQ to SQL is dead) and there's now a StackOverflow thread you can use to follow the discussion.

It will be interesting to track over the next few days what other responses pop up. A few things that come to my mind:

1. This is an opening for people using Linq to SQL to migrate to SubSonic or NHibernate. I've been using SubSonic lately and enjoying myself quite a bit.

2. A big disappointment for people like Benjamin Day, who I saw speak at VSLive and who have incorporated LINQ to SQL into their development projects. Next time wait for Microsoft to 2.0 something before spending too much time working it into your dev cycle.

3. I wonder if Microsoft hasn't thought of spinning off some of the tool makers as different companies. Perhaps there is an MBA out there who could draw some charts and convince management to have ADO.NET Entity Framework and the LINQ to SQL folks compete/copy in the same space. If the team isn't fully formed, why not make one? And then how cool would that be: profit centers that encourage developers to use Microsoft tools and competition to get the best ideas out the door. The trade off of Balkanization versus One True Product is that the former will mean that cool new ideas belong to you - the company won't have to get the ideas from outside sources. 

4. I wonder if we aren't moving towards a tipping point of Microsoft developers "getting" open source and shifting their tool/framework usage outside of Microsoft. I know lots of people who won't use __anything__ unless it has a Microsoft logo because it feels safer and more commoditized. Moves like this seem to go against that since open source projects tend to live a longer time and technologies / frameworks from companies become "obsolete" (how many versions of ADO can you count?) because of a product cycle that requires new purchases every 2 years or so. I would argue with my boss on a new project that it's safer to use NHibernate than ADO.NET Entity Framework if this indicates a chance that the Entity Framework could be the next LINQ to SQL project.


Thursday, October 23, 2008

.NET Survey


Scott Hanselman did a survey of .NET usage and posted results on his blog. A few interesting notes:

1. Heavy use of older stuff: ASMX, DataSet, WebForms, Windows Forms
You'd be hard pressed to find new blog entries on them, but it shows that older (tried-and-true) technologies don't sync up with what's popular. The last few projects I've worked on have had development cycles that far exceed the excitement of their underpinning technology. One needs to remember that excitement and hype aren't what make something useful; use is. So the people who talk down DataSets do so in the face of what obviously works for a silent majority who are busy getting stuff done.

2. Some new stuff won't take, some new stuff takes off
The numbers for respondents using CardSpace and Workflow are dismal. I can match this "unscientific" survey with a lot of my own experience over the last few years and conclude it reflects a broader trend. On the other hand, LINQ to SQL has very heavy use for something relatively new, almost on par with ADO.NET DataSets, which one could argue occupy the same competitive space. That says to me that in some cases adoption is slow when there's a "legacy" counterpart for a technology, but when the new technology is compelling, people take to it very quickly.

3. WinForms > WPF
WinForms has more than twice the respondents of WPF, but here's my conclusion: it's all about the toolset, not the technology. WPF's tools are not on par with their WinForms equivalents. The dual Blend + VS strategy doesn't make sense for the large majority of us who don't work with full-time designers, and the lack of built-in controls (such as a DataGridView) means that, as cool as rounded edges are, function trumps form when programming.

4. MVC?
Last I checked this was *just* released as a beta, yet more people claim to use it than WPF, which seems to indicate a Hanselman-type majority among respondents. Even if I wanted to jump into MVC, until it reaches a "1.0" release I couldn't justify starting or migrating a major product to a framework that isn't yet complete (and I'm wondering how others seem to be doing so).


Wednesday, October 22, 2008

Recursive Copy ala LINQ


Tonight I needed to copy a bunch of images, no matter their level in the hierarchy, to a single folder. A cakewalk with C# and LINQ; I'll post the Python equivalent tomorrow.

string sourceDir = @"C:\thesource";
string targDir = @"c:\thetarget";
var files = new DirectoryInfo(sourceDir).GetFiles("*", SearchOption.AllDirectories)
    .Where(p => !p.Extension.Contains("db")) // exclude the pesky windows db file
    .Select(p => p.FullName);
foreach (string path in files) {
    File.Copy(path, Path.Combine(targDir, Path.GetFileName(path)));
}
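Here is a sketch of what that promised Python equivalent might look like (my sketch, not the eventual post): walk the tree, skip the pesky Windows db files, and flatten everything into the target folder.

```python
# Recursively copy every file (except *.db) from source_dir into a
# single flat target_dir, mirroring the C# LINQ query above.
import os
import shutil

def flatten_copy(source_dir, target_dir):
    for root, dirs, files in os.walk(source_dir):
        for name in files:
            if name.lower().endswith(".db"):  # exclude the pesky windows db file
                continue
            shutil.copy(os.path.join(root, name),
                        os.path.join(target_dir, name))
```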



Monday, October 20, 2008

Better Know a Framework: Zip / Package Multiple Files in C#


Making a single archive out of multiple files in .NET is easy once you know where to look, but I spent far too long realizing there is a System.IO.Packaging namespace (not in mscorlib; you need to reference the WindowsBase assembly).

// target file
string fileName = @"C:\temp\";

// point to some files, say JPEG - did I say I love LINQ?
var filesToZip = new DirectoryInfo(@"c:\temp\").GetFiles("*.jpg").Select(p => p.FullName);

using (Package exportPackage = Package.Open(fileName))
{
    foreach (var file in filesToZip)
    {
        using (FileStream fileStream = File.OpenRead(file))
        {
            Uri partUriDocument = PackUriHelper.CreatePartUri(new Uri(Path.GetFileName(file), UriKind.Relative));
            PackagePart packagePart = exportPackage.CreatePart(partUriDocument, System.Net.Mime.MediaTypeNames.Image.Jpeg);
            CopyStream(fileStream, packagePart.GetStream());
        }
    }
}

// somewhere else, the CopyStream
private static void CopyStream(Stream source, Stream target)
{
    const int bufSize = 0x1000;
    byte[] buf = new byte[bufSize];
    int bytesRead = 0;
    while ((bytesRead = source.Read(buf, 0, bufSize)) > 0)
        target.Write(buf, 0, bytesRead);
}

The trap is that searches for this all point toward GZipStream, which is nice for a single file but not applicable to packaging multiple files. In my case I wasn't even interested in compression, just a single archive. You'll find additional documentation for the library here.

I'm half tempted to write a nifty tool in IronPython for command line packaging but in a world with 7-zip, that's simply foolish.
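For what it's worth, that command-line packager is only a few lines of standard-library Python; zipfile handles multiple entries where GZipStream cannot. The function name and paths are my own illustration:

```python
# Package every file matching a glob pattern into one zip archive.
import glob
import zipfile

def package(target_zip, pattern):
    with zipfile.ZipFile(target_zip, "w") as z:
        for path in glob.glob(pattern):
            z.write(path)

# usage (illustrative paths):
# package(r"c:\temp\photos.zip", r"c:\temp\*.jpg")
```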


Saturday, October 18, 2008

Mashing Up: JQuery and Windows Forms, a first WIB


A lot of people are down on Windows Forms as something passe. I have a feeling there are a lot of people like me who not only must use it for practical reasons but also think there are advantages to fat clients that sit on the Windows platform. I'm getting to a point where I will start doing more than poking around with WPF but at present two factors hold me at bay: the control library is not robust enough (another tutorial of how to make a custom button in blend isn't incentive enough) and XAML is still a bit daunting for what usually end up being rather involved UIs. Here, for example, is a UI from an application I've been working on for a while (small intentionally to impart an idea of complexity without revealing any details):

I think a lot of "do it in XAML" problems can be solved with a more web development style paradigm in Windows. A simple example is template driven databinding with styles. That's easy to do in web applications but a little less flexible in a control based world (think a GridView with CSS versus the Windows Forms DataGridView).

As a proof of concept I decided to use the System.Windows.Forms.WebBrowser in a Windows Forms application to get cool features from JQuery while maintaining the strengths of a windows app. Here's the scenario: a photo manager application. It builds and manages a database of photos from a digital camera. Rather than uploading the 300+ pictures taken in a single session to Flickr (or in conjunction with that act) one can open it up, start annotating their files and saving the comments to a portable "database" that they can back up along with the files on their home machine.  On any machine with the application they point to the directory and can view their original comments along with the pictures. Here is a screen shot:

The strength in this scenario of Windows Forms is the ease with which one can build the editing interface: textboxes, buttons and other "drag / drop" pieces of the control library. The strength of JQuery is in the display - I've chosen to use an animation and the accordion to spruce up the interface where the user looks at their photos.

You can download the whole thing here. (Very 0.1 but posted as an example/idea - just use the next / previous buttons to view some photos, get a feel of things).

I wonder if this paradigm can be a kind of Ajax "aha" moment; Ajax was centered on the use of a single, relatively simple MSXML component. Though System.Windows.Forms.WebBrowser is not simple, it's a single component that has a lot of room for leverage. Ajax set up some pretty standard protocols / approaches to using asynchronous HTTP requests from a client - I think an interesting parallel could exist in wrapping jQuery and HTML controls into a kind of framework that is easy for any developer to use with the WebBrowser control. Imagine a grid you drop with some properties and a CSS. The last thing that made Ajax pop was the catchy name. I love naming things myself so I've been calling this kind of application a WIB (Windows + Web) but perhaps there's something catchier one could use.

Just some ideas, I'd be curious in the following:

1. Any existing applications written with this approach.
2. Any potential applications that could benefit with the pros of both Windows and Web
3. Any glaring shortfalls



ps. Thanks to Corey for encouragement on the blogging tip.

Wednesday, October 01, 2008

YUI File Disjoin Complexity HELP


I got a little help yesterday from Dav Glass and Eric Miraglia for my issue on selection of YUI javascript files. What I'd had distributed over a bunch of files is now:

<script type="text/javascript" src=""></script> 

Hm... a bit better. Both Dav and Eric confirm that the canonical solution to this problem is to spend some quality time at the YUI Configuration Site which will allow you to pick the widgets and utilities you've leveraged in order to combine it into a more simplified call.

A someday project: the following style of implementation to make the reference a bit more crisp (I didn't add everything from above, but the pattern should be easy to observe):

<script type="text/javascript" src=""></script> 


Anyhow, thanks for the help.


Tuesday, September 30, 2008

YUI File Disjoint Complexity FAIL


So here's what has been getting my goat for a while with YUI and a very simple application I've been using it to build.

So for said application, I have to reference 12 separate Javascript files along with 5 separate CSS files.  All for this simple interface (datatable, tabs, button, ajax (connection), events and maybe a few more things I am not remembering):

This type of complexity is what makes YUI daunting for many developers and difficult to maintain; over at nRegex I'm referencing a bunch of older versions of YUI, and if/when I do decide to update there will be quite a few places where compatibility problems and errors could manifest themselves.

YUI controls are nice once working but I'm at the point of wondering if this kind of disjointedness should lead me towards jQuery or Dojo. Especially with Microsoft's recent adoption of jQuery, I'm leaning in that direction. But I don't want to be rash so I'll ask:
1. What is the canonical solution e.g. blessed by Yahoo! (I see a YUI loader of some kind)?
2. What does that look like with the 5 CSS and 12 *.JS files I've got above?


Monday, July 28, 2008

Reading Source, Minima III


As my reading of the Minima source code continues, I wanted next to comment on the database interaction, which is novel (to me at least) in that it is exclusively LINQ, with no stored procedures or classic ADO.NET. I've seen LINQ quite a bit, but many people using it seem to stick to the query syntax rather than raw lambda expressions. All the LINQ in Minima that I've seen uses lambda expressions, and I've come to appreciate it as a clean, sparse syntax. To illustrate with a more trivial example, assume I have a table Users with Username / Password columns I want to verify for access. The first step with all things LINQ is to make your data model, which amounts to a DBML file that you map database objects onto with drag and drop. Next you can write query expressions to pull data out of the database - notice that the db variable here points to the data model as embodied in a "data context" object:

            var authUserQuery = from u in db.Users
                                where u.Email == Login1.UserName && u.Passphrase == Login1.Password
                                select u;

Unfortunately the query expressions return an IEnumerable<T> that doesn't support a Count or Length property. It's a somewhat inflexible thing that in my *very* limited experience is useful mainly for iterating with a ForEach - and I suspect it's reasons like this that led Betz to using a lambda syntax that is much cleaner and gets more bang with less code:

            Func<Data.User, Boolean> checkUser = x => x.Email == Login1.UserName && x.Passphrase == Login1.Password;
            Data.User u = db.Users.SingleOrDefault(checkUser);

You can use nullability to test for the presence of a record based on the LINQ query. Some source from Minima that demonstrates this is verification of the existence of a blog author:

            Func<AuthorLINQ, Boolean> authorExists = x => x.AuthorEmail == authorEmail;
            authorLinq = db.Authors.SingleOrDefault(authorExists);
            if (authorLinq == null)
                throw new ArgumentException(message);

It's truly novel, and I don't mean that as trivial or ornamental praise; after spending so much time building data access layers with stored procedures and ADO.NET (Connection, Command, Reader, DataSet, ad nauseam), it seems like our collective destiny is to walk away from all that and let something like a DBML data context do the black magic for us.
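As an aside, the predicate-plus-default pattern has a close Python analogue in next() over a generator expression. Strictly, next() with a default mirrors FirstOrDefault (SingleOrDefault also throws when more than one row matches); the users list here is invented for the demo:

```python
# A Python take on db.Users.SingleOrDefault(checkUser): filter with a
# predicate, fall back to None when nothing matches.
users = [{"email": "a@example.com", "passphrase": "x"},
         {"email": "b@example.com", "passphrase": "y"}]

check_user = lambda u: u["email"] == "b@example.com" and u["passphrase"] == "y"
user = next((u for u in users if check_user(u)), None)

# the null test for existence, as in the Minima author check
if user is None:
    raise ValueError("no such author")
print(user["email"])  # b@example.com
```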

Even though the ADO.NET Entity Framework is getting a FAIL vote from some people, I'm interested to see how this overlaps with LINQ. Even after listening to Hanselman's interview of Mike Pizzo I'm still a little unclear. My plan of action is to get serious about NHibernate to try to understand, as Scott would say, "the gestalt" of an ORM and then make the call on ADO.NET Entity Framework when it's released.


Thursday, July 24, 2008

Twining, TwyDL


A quick note that Twining isn't dead and that I actually have been working on it. Some work I've done in the past few months has involved a lot of databases I don't have more than an ISP's user/password level of access to, so it's been quite handy to copy data in and out with brief pieces of code. Here's one feature I've used, exporting a table definition and all of its contents to a scripted insert statement in a file:



Another piece that I've been working on is the ability to generate schema objects in a more fluid, compact way. I like SQL just fine but after seeing how easy it is to make data structures with the Google App Engine I sought to shoot for something similar. Also within the flow of writing a Twining script it's easy to keep this type of syntax Pythonic for certain scenarios (you want to create a temp table and dump data into it from somewhere; you want to create a table and generate sample data, etc, etc). I call it TwyDL as a play on DDL:



The methods you see for defining column types take optional parameters so you can pass additional specifications when you feel the need. For example, the string columns default to varchar(50) but if you wanted to have a specific length all you'd need to do is:

col.string("FirstName", 25)

The same is true for numeric types:

col.numeric("Salary", 18, 4)
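For the curious, here is a hypothetical sketch of what such column builders might do under the hood - this is my own guess at the flavor of TwyDL, not Twining's actual code:

```python
# A toy fluent column builder: each method emits a SQL column
# definition, with sensible defaults for length / precision.
class col:
    @staticmethod
    def string(name, length=50):
        return "%s varchar(%d)" % (name, length)

    @staticmethod
    def numeric(name, precision=18, scale=0):
        return "%s numeric(%d, %d)" % (name, precision, scale)

print(col.string("FirstName", 25))   # FirstName varchar(25)
print(col.numeric("Salary", 18, 4))  # Salary numeric(18, 4)
```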

I've got one more thing I'm interested in implementing (time! time!) which is generating sample data given some table definition. I'm hoping to get up a rough cut of it all in the next day or two, after I've updated unit tests and organized things.


Monday, July 21, 2008

Reading Source, Minima II


Yesterday I wrote that my next stop was the use of Generics with WCF. Sometimes, it seems, a lot of experience is a handicap to learning new things - such was definitely the case when it came to the full use of Generics in C#. Most of my usage, I admit, has been limited to the generic collections such as List<T> or Dictionary<T, K>. But there is a beautiful leveraging of Generics I see with WCF that can be applied all over the place. As a trivial example, consider the code that is used to safely read values from a database field. I will admit to writing a lot of code like:

class ReaderHelper
{
    public static string GetSafeString(object field);
    public static decimal GetSafeDecimal(object field);
    // and so on.
}

// somewhere else in code:
string firstname = ReaderHelper.GetSafeString(rs["FirstName"]);
decimal salary = ReaderHelper.GetSafeDecimal(rs["Salary"]);

A lot of redundancy and little extensibility. Here's how a WCF-infected mind might write that:

class ReaderHelper
{
    public static T SafeField<T>(object field, T defaultValue)
    {
        return (field == DBNull.Value) ? defaultValue : (T)field;
    }
}

// somewhere else in code:
string firstname = ReaderHelper.SafeField<string>(rs["FirstName"]);
decimal salary = ReaderHelper.SafeField<decimal>(rs["Salary"]);
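The same null-coalescing idea reads naturally in Python, where the default value stands in for the type parameter. A minimal sketch, with the rs record invented for the demo and None playing the role of DBNull:

```python
# One function covers every field type: return the default when the
# database handed us a null, otherwise pass the value through.
def safe_field(value, default):
    return default if value is None else value

rs = {"FirstName": "Ada", "Salary": None}
firstname = safe_field(rs["FirstName"], "")
salary = safe_field(rs["Salary"], 0.0)
print(firstname, salary)  # Ada 0.0
```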

If you coupled the above with some inversion of control you'd have pretty powerful and extensible code for dealing with any potentially null field type coming from your database. It's not that such usage is an innovation exclusive to WCF; I just see it used a lot more there. Back to Minima, one can see a good example of this in the project responsible for exposing configuration values:

namespace Themelia.Configuration
{
    public static class ConfigAccessor
    {
        public static T ApplicationSettings<T>(string key);
        public static string ApplicationSettings(string key);
        public static T ApplicationSettings<T>(string key, bool isRequired);
        public static string ApplicationSettings(string key, bool isRequired);
        public static string ConnectionString(string key);
    }
}


More on Minima later (hopefully tomorrow), taking a look at database access.


Sunday, July 20, 2008

Reading Source, Minima I


I'm interested in the notion of deliberate practice as it relates to software developers.  How do we "practice" effectively?  One of the things I've been wanting to be more serious and deliberate about was reading source code.  Hanselman has inspired me with his "weekly source code" posts and since good ideas are worth copying I'll start to document some of the projects I download and look into for examples of good code.

The project I'm currently working on involves a lot of WCF. I've got mixed feelings (as well as a lot to learn) about it so I thought I'd start off by running Minima, Dave Betz's WCF/.NET 3.5 blog engine as a sample of an end-to-end application I can model my own WCF architectures after. 

Minima was not easy to get going. While I'm perhaps not the brightest kid on the block, the biggest item I struggled with would probably be a challenge to anyone: the certificates used in the WCF security model. While Betz has a blog entry documenting the basic usage of certificates with WCF, it is predicated on the WCF host not being IIS. For the blog I assumed everything should be deployed to IIS, and ran into a lesser-known fact: one must use the Windows HTTP Services Certificate Configuration Tool to grant the ASPNET account access to the certificates. There were also a couple of post-build events in the project that I removed after getting my references set up.

But before I get into particulars, I wanted to go over the broader structure of the project.  The Minima solution has the following projects:
* Minima.Service
This has the service and data contracts with no implementation
* Minima.ServiceImpl
This has the implementation of the service
* Minima.Configuration
A configuration helper which makes use of a library called Themelia (Betz is, apparently, a Biblical Greek scholar on the side)
* Minima.Web
Library for the blog website's features
* Website
A website project
* Website Service project (exposes services for posting content to the database via WCF)

I'll write about more particulars in my next post, but for this overview my biggest takeaway from the project structure is decoupling: separating the service from the service implementation, separating the website from its functionality via a project whose only purpose is implementing that backend logic, and separating general-purpose frameworks like Themelia (for reading the config file, etc.).

When I start my own projects I usually think too much about doing a proof of concept and less about decoupling the overall structure. As a result most projects start off as a "sketch" in a single Visual Studio project and by the first release end up staying that way on a permanent basis. But perhaps what we should focus on as programmers is less the cleverest algorithms and more the most elegant structure for building solid applications.

Next stop: WCF and Generics.


Wednesday, July 16, 2008

Tragically Well Known


Of late there seem to be a lot of the well knowns of blogging bowing out (at least in volume) because of how much work it takes to qualify and disclaim the topics they take upon themselves. I find it sad because even the essays I find myself disagreeing with usually provide the same thought provoking effect as the ones I enjoy in agreement. It's strange to think that people like Paul Graham and Joel Spolsky would be exasperated by the barbs people toss their way and yet, it seems, they are.  Paul's essay on disagreement seems evidence of this and the most recent Stackoverflow podcast dwells considerably on the topic from Joel's perspective.

I would like to make the following proposal both in encouragement of people like this writing and for people like myself enjoying it: rather than quick barbs and/or "jumped the shark" posts, return to the previous material of the author and enjoy an old post. It's not that disagreement is bad or unwelcome; I think a lot of the well knowns enjoy a spirited discourse.  It has more to do with people either misunderstanding or disagreeing in the wrong way - DH0 to DH4 as Paul would have it.


Sunday, July 13, 2008

Predictably Irrational: Startups and Open Source


The Herding Code podcast took up the topic "Why don't startups run on Microsoft?", which I thought was interesting - it crossed paths with comments in a spat I've followed between Atwood (who chose .NET as the platform for Stackoverflow) and Marco Arment of Tumblr fame (side note: I actually have a tumblr). Had you asked me before Friday for an opinion, I would probably have attributed it to the Byzantine licensing schemes for Microsoft products as well as marketing-oriented decisions like making IIS 7 unavailable on XP.

On Friday, however, I heard a commentary on Marketplace from Dan Ariely that suggests a hypothesis that hadn't occurred to me but seems to make perfect sense for software. Ariely, a behavioral economist at MIT, starts by suggesting that the next time you eat out with friends, it might be better to have one person pick up the "pain of paying" for dinner rather than distributing it across all the people who were out. According to the commentary (emphasis mine):

Findings from behavioral economics tell us that one person should pay the entire bill and that the person paying should alternate over time. When we pay any amount of money, we feel some psychological pain. We call this the pain of paying. This is the unpleasantness that is associated with forking over our hard earned cash. But it also turns out that this pain does not increase linearly with the cost of the meal. This means that when we double the payment, the pain doesn't double; it increases just by a bit. In fact, the biggest increase in the pain of paying comes when we switch from paying nothing to paying something. Now, it's easy to see why one person should pay the entire bill. How so? Well, if every person paid their share, they would all experience some pain of paying.

If you're curious about how Ariely is able to make that assertion, he elaborates on this and other ideas in his book Predictably Irrational.
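The dinner arithmetic is easy to sketch. If pain grows concavely with the amount paid - the square-root curve below is purely my illustration, not Ariely's actual model - then four people each paying a quarter of the bill feel more total pain than one person paying all of it:

```python
import math

def pain(amount):
    # purely illustrative concave "pain of paying" curve;
    # any function that flattens as the amount grows makes the same point
    return math.sqrt(amount)

bill, diners = 120.0, 4
split_pain = diners * pain(bill / diners)  # everyone forks over a share
single_pain = pain(bill)                   # one person absorbs the whole bill
print(split_pain > single_pain)            # splitting produces more total pain
```

The biggest jump in the curve is from zero to anything at all, which is exactly the commentary's point.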

It makes perfect sense that this could be a credible idea behind why startups don't work with Microsoft technology - that the absolute cost is less of a factor than the mere fact of "costing" some dollar amount. Attach to that complex licensing and many flavors of the same product, and you get pain that goes beyond the "pain of paying."

It makes some sense on the Microsoft side of things too; rather than trying to determine which products their developers should be matched to, most companies just buy a high-end MSDN subscription even though chances are high that not all the products are actually used. Microsoft can count on "enterprise" companies conservatively overpaying and hence gets the most out of them.

A couple of thoughts that come to mind:
1. How could one go about trying to measure the "pain of paying" with software?
2. Does "free" open source translate to more time spent on manageability/expertise?


Friday, June 27, 2008

Change Dispensed


Steve posted a coding challenge a few days ago - a change dispenser. I thought it would be a nice exercise for my fledgling Python skills and implemented it with just a small variation on pluralizing coin denominations.

import re

def make_change(amt):
    output = "Change is "
    coins = (['quarte|r|rs', 25], ['dim|e|es', 10], ['nicke|l|ls', 5], ['penn|y|ies', 1])
    for i in coins:
        r = amt // i[1]
        if r > 0:
            # split "stem|singular|plural" into its three pieces
            coinout = re.split('\|', i[0])
            output += "%d %s%s " % (r, coinout[0], coinout[1:][r > 1])
            amt = amt - (r * i[1])
    print output


A few things I learned along the way:

* I was reminded that dictionary objects do not guarantee order.

* I was using Python regular expressions to do a match on the string "this |and|or that" and discovered if you use match with simply \|(\w+) it returns nothing since the match must start from the beginning of the string!

* I was going to use a lambda expression for the pluralization but that's such a large hammer for what in the end would become a split
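The match-anchoring gotcha in the second bullet is easy to demonstrate:

```python
import re

s = "this |and|or that"

# re.match anchors at position 0, so a pattern that begins mid-string fails:
print(re.match(r'\|(\w+)', s))            # None

# re.search scans the whole string and finds it:
print(re.search(r'\|(\w+)', s).group(1))  # and
```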

Python's behavior with booleans is another point of interest - when I write coinout[1:][r > 1] I'm taking advantage of True evaluating to 1. I wonder if this is "bad" but it's a seductive thing to take advantage of...
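A minimal sketch of that trick, using a hypothetical coinout list in the same stem/singular/plural shape:

```python
coinout = ["penn", "y", "ies"]  # stem, singular suffix, plural suffix

r = 3
# True is 1 and False is 0, so a comparison can index a sequence:
print(coinout[0] + coinout[1:][r > 1])  # pennies

r = 1
print(coinout[0] + coinout[1:][r > 1])  # penny
```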


Monday, June 23, 2008

Deep Fried Bytes, Yegge, Interviewing


Found a good podcast today, "Deep Fried Bytes." Skip the first episode, but the second, on interviewing, is a gem. Although it was posted on May 29, it ties in quite well with thoughts delivered (not simply written, but delivered) by Steve Yegge in his recent post on finding good people, Done and Gets Things Smart. Not only is it entertaining to hear Ayende Rahien take nonusers of using blocks to task, but later Scott Bellware seems to be on the same page as Yegge on how you can know if people are "good."  Here is his approach (listen to the podcast if you want total accuracy):

- I don't ask questions I do pair programming for interviews...
- Interview questions are irrelevant... most people asking interview questions are showing the people they're interviewing what they know
- Ask for a code sample with unit tests that run inside Visual Studio
- If that's okay ask them to come in for 4 hours of pair programming
- Most interview related items would come up during the pair programming session


Thursday, June 12, 2008

C# Multiple Replace Extension Methods, Levithan Style


A couple of nights ago I blogged about an implementation of a technique to replace multiple patterns in a string in a single pass. Steve Levithan had an entry on the approach (not mine specifically - he covers the approach and comments on a few of its weaknesses). It inspired a few things out of me: first, making the multiple replace an extension method of the string class, and second, duplicating Steve's approach, which enables a more robust model because you can use metasequences etc...

Here is the code (included the using statement since the use of ToArray() from the Dictionary key collection isn't available without System.Linq):

using System;
using System.Linq;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class RegexExtender {
    // single pass: one alternation pattern built from the dictionary keys
    public static string MultiReplace(this string target, Dictionary<string, string> replacementDictionary) {
        return Regex.Replace(target,
            "(" + String.Join("|", replacementDictionary.Keys.ToArray()) + ")",
            delegate(Match m) { return replacementDictionary[m.Value]; });
    }

    // Levithan-style: each key is itself a regex, applied in sequence
    public static string LevithansMultiReplace(this string target, Dictionary<string, string> replacementDictionary) {
        foreach (string key in replacementDictionary.Keys) {
            Regex r = new Regex(key, RegexOptions.None);
            target = r.Replace(target, replacementDictionary[key]);
        }
        return target;
    }
}

Here is some usage:

// the original approach, as an extension method
string x = "Holly was a hunter";
Dictionary<string, string> rdict = new Dictionary<string, string>();
rdict.Add("Holly", "Hannah");
rdict.Add("hunter", "hatter");
Console.WriteLine(x.MultiReplace(rdict));             // Hannah was a hatter

// Steve's technique - the keys are regex patterns
rdict = new Dictionary<string, string>();
rdict.Add(@"[A-z]", "x");
rdict.Add(@"\d", "y");
string test = "David is 33";
Console.WriteLine(test.LevithansMultiReplace(rdict)); // xxxxx xx yy


Tuesday, June 10, 2008

Regex: replace multiple strings in a single pass with C#


I wish I could say I was the clever one to think of this, but I ran into it in my copy of the Python Cookbook (the original author is Xavier Defrang, the Python implementation here). It's cool enough that I ported it today - I know I'll use the C# implementation quite often:

static string MultipleReplace(string text, Dictionary<string, string> replacements) {
    return Regex.Replace(text,
        "(" + String.Join("|", replacements.Keys.ToArray()) + ")",
        delegate(Match m) { return replacements[m.Value]; });
}

// somewhere else in code
Dictionary<string, string> adict = new Dictionary<string, string>();
adict.Add("Jonathan", "David");
adict.Add("Smith", "Seruyange");
string temp = "Jonathan Smith is a developer";
string rep = MultipleReplace(temp, adict);
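For comparison, here's a sketch of what the Python Cookbook recipe looks like - from memory, not Defrang's exact code. Note the re.escape call, which keeps dictionary keys containing regex metacharacters from misbehaving:

```python
import re

def multiple_replace(text, replacements):
    # one alternation pattern built from all the keys, metacharacters escaped
    pattern = re.compile("|".join(re.escape(k) for k in replacements))
    # single pass over the text; each match is looked up in the dictionary
    return pattern.sub(lambda m: replacements[m.group(0)], text)

print(multiple_replace("Jonathan Smith is a developer",
                       {"Jonathan": "David", "Smith": "Seruyange"}))
# David Seruyange is a developer
```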


Sunday, June 08, 2008

Languages And Thinking


I'm approaching the end to a rough week and should be coding but the virus scanner on my office machine is slowing development to a point where it's not bearable for me.  I'll pass the time with a quick thought about how programming languages affect thinking and why it makes me both picky about languages I use and also curious to see how the different languages solve problems.

I will confess that the project I've been working on is written in VB.NET.  The reason for that was very practical since our client was adept at VBA and needed to, on occasion, get to the code level of things.  I'd spent a lot of time in Visual Basic (the "classical" VB) so I thought nothing of it - perhaps somewhat of a refreshment after so many years away.  As we wrote more and more code, I started noticing a lot of nested "If" statements in the style of:

If SomeCondition Then
    If SomeOtherCondition Then
        If YetAnotherCondition Then

You get the picture.

I remember working with a different VB programmer who was also given to writing a lot of nested conditions in the same manner, even though we were using C#.


At the time it really bothered me in a "code smell" sort of way; nesting conditions, I would have argued, invites clutter and entropy. Having attended Hanselman University for a while now, I can say with more technical gravitas that nested conditions increase cyclomatic complexity, and this is why they smelled bad. The alternative I would have wished for would have been:

if(cond && cond2 && cond3){

Pretty natural, right?  Actually no, especially if you're coming from Visual Basic, because the Visual Basic And and Or operators don't do short-circuit evaluation. If you combine boolean expressions you have to make a mental note to verify that both will work at runtime - that there is no dependency between them. That's something a C# programmer completely takes for granted, often writing code like the following, which is a ticking time bomb in VB.NET (it won't go off until you're demonstrating the app to your client, trust me):

If Not MyCollection Is Nothing And MyCollection(0) = "Something" Then

One expects the second expression to be skipped if the first fails, in "short circuit" fashion. Unfortunately, Visual Basic .NET will evaluate the second even when the first is False!

To clean this all up the VB.NET folks provided the AndAlso and OrElse operators which do traditional short circuit evaluation. I read somewhere that the reason And and Or were left as is was for backward compatibility and people who were "upgrading" their code from Visual Basic to Visual Basic.NET. Using these operators you're back to thinking "normally" as a C#, C, Javascript, etc... developer.
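For contrast, here's how a short-circuiting language treats the same null-check pattern - a quick Python sketch:

```python
my_collection = None

# `and` short-circuits: since the left side is False,
# the subscript on the right is never evaluated and nothing blows up
if my_collection is not None and my_collection[0] == "Something":
    print("found it")
else:
    print("no error raised")
```

The equivalent VB.NET with plain And would throw before the If ever resolved.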

So back to my original thought - nested If blocks are a simple way around an operator that doesn't short-circuit, and they become a way of thinking if you've spent a lot of time writing Visual Basic code. The language drives that, even when alternatives exist and even when it produces cyclomatic complexity. Some people tell me picking a language is just a simple process of picking a tool, but I think it's more than that: it's picking a way of thinking. That's why it's not only important to me, it's interesting to see how different languages lead us in different directions. Even if an idiom is available elsewhere, getting to think that way is most often the trick to being better in both languages.

One more side note: I can chart a few different ways in which new languages have affected me. Take a look at the kind of thing I used to do a lot:

string master = "";
foreach(string s in stuff){
    master += s + ",";
}
// trim the trailing comma
master = master.Substring(0, master.Length - 1);


// like a map, I first started thinking this way from perl
stuff.ForEach(delegate(string s) { master += s + ","; });
// like a join, very Pythonic
String.Join(",", stuff.ToArray())
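For reference, the Python originals those two C# lines echo:

```python
stuff = ["alpha", "beta", "gamma"]

# the join idiom the String.Join call mirrors:
print(",".join(stuff))                  # alpha,beta,gamma

# and a map-style pass, perl/functional flavored:
print(",".join(map(str.upper, stuff)))  # ALPHA,BETA,GAMMA
```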


Sunday, June 01, 2008

Galloway et. al. Roundtable Podcast


If you haven't and you do Microsoft development, give an ear to John Galloway, K. Scott Allen, Scott Koon, and Kevin Dente in their "Technology Roundtable" podcast.  There are a lot of podcasts I find entertaining (like Joel and Jeff's discussions), but these guys seem to do what I do - and to contextualize that, since it may sound presumptuous: they deal with Microsoft development tools building "real world" software.

I try to have "takeaways" from podcasts. From the first: more impetus to investigate Ninject and dependency injection, as well as more investigation of LINQ to SQL and the ADO.NET Entity Framework (opinions flew on these; I'd like to develop some of my own). Finally, even though I'll probably skip the beta, I look forward to .NET 3.5 SP1 since ASP.NET Dynamic Data will be packaged therein.

The second podcast had a great discussion on javascript libraries. I've used jQuery, YUI, and Prototype on different projects, and while my fondness for Prototype/Scriptaculous is probably greatest of all, I am always eager to hear other people's experiences. The framework I have yet to do anything serious with is Dojo - I did some quick prototyping of their flyout menu with ASP.NET server code and, I hate to say it, the ASP.NET Menu control was a lot more effective and fast. While it wasn't explicitly stated, this podcast dampened my interest in the ASP.NET Ajax stuff - I looked at it again a few weeks ago and it seemed so... heavy.  I'm not sure who mentioned it in the podcast, but they said it seemed more oriented toward writing controls than Ajaxy applications. I would have to concur, with my limited knowledge.

I'll stay tuned, it will be interesting to see what they talk about next time.


Saturday, May 31, 2008

Knowing C


I've been having a good time listening to the StackOverflow podcasts. Joel's a great curmudgeon and Atwood is as opinionated as ever.  An ongoing disagreement between the two has been the usefulness of knowing C.  Joel's adamant about its importance and Atwood thinks there is much else a developer can spend their time in understanding. 

Eric Sink has since chimed in with a post called C and Morse Code wherein he reveals his cards: like Joel, he thinks it's of vital importance if you want to reach past mediocrity:

I'm not going to take a black-and-white stance on this.  I won't go so far as to say that every developer must learn C.  I've met lots of developers without C experience who are successful and making positive contributions to important software projects.

Furthermore, I'll admit that knowing C is not a magic solution to poor skills.  A lousy developer who happens to know C is simply better equipped to hurt himself or somebody nearby.

However, I can say these two things:

  1. All of the truly extraordinary developers I know are people who really understand the kind of low-level details that C forces you to know.
  2. Every programmer without C experience has a clear path of personal development:  Learn C.  Get some real experience using C to write a serious piece of software.  Even if you never use it again, you'll be a better programmer when you're done.

My relationship with C is tenuous at best. I spent a few months some years ago delving into it (and C++) seriously - I'm curious to go back and find some of that code but it was a few machines ago*.  I remember during that time I'd work on little toy programs (mostly an implementation of some algorithm) in C and then have the need for a utility and write it in something else.  It would be interesting to take Eric up on his second point of writing a "serious piece of software" using the language.  One interesting angle on this is how one would keep said "serious piece of software" strictly in the realm of C, without venturing too much into the world of C++ and object orientation done poorly.

So brainstorm question: what's something nice and useful that could be implemented strictly within C? (That one might not yawn and think *gosh* that would take 1 minute to do in Python).


*Great memories of Me, Markus, and bcc32 in a coffee house somewhere in southern California.  Too far gone are those days...

Thursday, May 29, 2008

Splitting Files with Python


I've recently needed to generate SQL scripts from large Excel spreadsheets. But once the script is finished I've had issues getting SQL Management Studio to execute that large a script in a single run. The solution? Split up the file with a little Python that takes its arguments like this:

ipy  LargeFile.sql 3

# pass arguments like so:
# ipy LargeFile.sql 4

import sys

if len(sys.argv) == 3:
    splits = int(sys.argv[2])
    f = open(sys.argv[1])
    data = f.readlines()
    f.close()
    lc = len(data) / splits
    print "number of lines", len(data)
    for c in range(0, splits):
        outstream = open("data" + str(c + 1) + ".sql", 'w')
        if c == splits - 1:
            lc = len(data)  # last file picks up any remainder lines
        for line in data[:lc]:
            outstream.write(line)
        outstream.close()
        del data[:lc]
else:
    print "Expected arguments: file and number of splits\n example: ipy LargeFile.sql 4"



Tuesday, May 20, 2008

It's speed that counts


I usually don't talk about hardware because I don't pay too much attention to it unless something is bothering me.  However I've discovered something interesting about myself in the last week.  I've done a lot of my development over the last few years on an enormous HP zd8000.  I jokingly called it "the 747" because of its size and girth - 10 lbs and a 17" screen.  For a guy like me who's usually also carrying a few books in his bag it's quite a load as evidenced by my going through at least one laptop bag (my current one is also in poor shape). 

So because of some changes at work I got to try out a machine I'd liked the thought of - another HP but this one with a small 13" screen and weighing perhaps 4 lbs.  It's a great little machine and has a cool look but after about a week I'm back to the 747.

Why go back to the back pains and encumbrance of this massive machine?  One simple reason: speed.  It's got twice the RAM (2GB) and a much faster processor.  I didn't think about this much, but for a person like me who is usually running an instance of SQL Server, an IDE of some sort, a text editor, web browsing with 10 tabs open, listening to music, chatting up friends (Messenger is no joke when it comes to resources), etc, etc - you get the picture - it's frustrating to be on a beautiful, compact machine that you have to wait around for. I'd rather take power with encumbrance over convenience with time penalties.

Or, as I like to joke with my South Dakota friends, I'm like a guy who exchanged his Chevy Silverado for a crossover vehicle... until he realized that he made his living hauling lumber.

Footnote: I ordered a Thinkpad T61 which should be a foot in both worlds. I guess my bit of the economic stimulus went to China.


Monday, May 19, 2008

Nregex mention


Steven mentions Nregex in a list of regular expression testers online. Nregex is still up and still useful especially if you need to work with .NET's implementation of regular expressions. A coworker of mine just used it to build a parser for Sql Reporting Services RDL files.  Steven's own RegexPal is a fairly intense implementation of regular expressions in javascript, complete with syntax highlighting.


Monday, May 12, 2008



Matt from 37Signals blogs about workaholics with the following assertions: they don't get as much done (most of the time) and they focus on inconsequential details.

Many leapt to the defense of workaholics - people who, it seems, are workaholics themselves. Because I'm often labeled a workaholic I'm trying to see past my emotions and yet it still doesn't smell like the truth to me.

And even more so because this weekend I watched Triumph of the Nerds, Cringely's chronicling of the personal computer industry from its humble roots in what would become Silicon Valley.  As he interviewed people, I couldn't help but think that software development is experiencing a culture change.  The people who got the industry off the ground were almost entirely obsessed with their work, even down to the details.  I have a hard time imagining an Andy Hertzfeld, Woz, or young Bill Gates as a 501 developer.

Even if you go forward a few years, guys like John Carmack don't fit the mould of "balanced life/time to go pick up my kids and watch TV!"

These days though, I think what used to be hobbyism is now simply work and fair game for any person who wants a way to earn their keep and "clock out" for life afterwards.  This is not to say that it wasn't that way before, it just seems like much more commitment was involved.  Or maybe the moral is that no one with a 501 development attitude did anything noteworthy.

But even as I write that and play my hand as a kid who grew up on the folklore of the early computer industry I have to do a gut check because me staying at work late building yet another website for someone is not the same as writing the first GUI.  Not even close...

I take away the notion that there isn't necessarily a direct relationship between time spent at work and productivity but I also know that if I had the 9-5 attitude with no tendency to "get into" my profession, I might as well be an accountant. And as grandiose as it may seem, I'd love to have one idea that really matters versus a lifetime of mediocrity so I could rush home to have a "life."


Saturday, May 10, 2008

Old Computer Books



I was recently feeling ashamed of myself after reading Atwood's Programmers Don't Read Books...  post for what he called Programming book pornography: "The idea that having a pile of thick, important-looking programming books sitting on your shelf, largely unread, will somehow make you a better programmer."  To clarify, I actually do read the books I have bought but I'm guilty of keeping a full shelf for the sake of showing off my long and continuing struggle to be a good programmer. 

One way I can soften this sort of conceit is by thinking of how I'm really proud and boastful of my friends in real life who do things that amaze me. I'm not shy to boast on their behalf.  In the same way a lot of these old books are like old friends that have seen me through some pretty turbulent times.  I carried Francesco Balena's Programming Visual Basic 6.0 around for years when I was training people on VB6, COM, and ASP.  Another set of heavy books I spent many a quality night in a hotel room with were Gary Cornell's Core Java and Core Java Advanced Features. I don't have a formal computer science education but I consider a large part of my education the 7 or so years I spent on the road, in various hotel rooms, reading and practicing what I needed to know.

If truth be told there are a few there that I didn't get much out of.  I never did run Slashcode and I never did more than tinker with Bryce.  But I'm not ashamed to say that I had hopes of doing so that time supplanted with other things.

I packed them away and made room for some of the books I have piled on top of the shelves.  Since I don't travel much and remain in project mode I'm not as efficient about reading what I have but a smaller shelf is more tidy and palatable.  I won't wait so long before my next big cleanup.

So the big question: what to do with these books?


Thursday, May 08, 2008



John Resig of jQuery fame has released a library called Processing.js for javascript graphics. Yes, you read that right: Javascript graphics.

Earlier this week I was in some training and the instructor asked what future we saw for Silverlight.  My response was that I can't drink the koolaid just yet; while there are certainly applications in streaming media in which Silverlight will compete to the death with Flash, for web applications I see people taking Javascript to the level where its maturity with the browser makes most applications feasible. I also like the competitive environment around the Javascript libraries - the Dojo, jQuery, YUI, Scriptaculous, and other people trying to outdo one another just means better ideas, faster turn arounds, and a better experience. 


Monday, May 05, 2008

C# Extension Method for Generic Collections


Tinkering a bit with extension methods tonight, inspired in part by Scott Hanselman to write something I've frequently needed with generic collections: to spit them out in some delimited format.  Here are the extension class and method:

static class EnumerableExtensions {
    public static string AsDelimited<T>(this List<T> obj, string delimiter) {
        List<string> items = new List<string>();
        foreach (T data in obj) {
            items.Add(data.ToString());
        }
        return String.Join(delimiter, items.ToArray());
    }
}

You can spit out your delimited instances of any List<T> now:

            List<string> test = new List<string>(new string[] { "David", "Morgan", "Philip" });
Console.WriteLine(test.AsDelimited(" => "));

List<int> primes = new List<int>(new int[] { 2, 3, 5, 7, 11, 13, 17 });
Console.WriteLine(primes.AsDelimited(" , "));


Thursday, April 24, 2008



I'm still trying different things with Twining.  I'd thought about writing some "front end" type experience for usage but it really crystallized as a need when I showed it to a person I know and there seemed to be a disconnect in how it could be used.  For me it's natural to set my path environment variable, launch favorite text editor X and then run things from the command line or a script, but it's a nuisance if you're used to a one stop shop for being able to use some tool.  And as much as I want language as the focal point in the word "tool" there is something practical in the notion of something you download and click a button to execute with.

Enter Twy, which I pieced together after looking at a few samples of a hosted DLR engine in a Windows Forms app.  Now one need not figure out how to install or configure anything, or worry about creating and disposing of script files.

If you want to write something that hosts the DLR engine, take a look first at these samples on Voidspace.  There are other samples online if you hunt and peck but be aware that things have changed between the various releases of IronPython.  A few gotchas for me:

1. Redirecting standard output:

// where engine references the ScriptEngine type
// and ms references a Stream of some sort
engine.Runtime.IO.SetErrorOutput(ms, Encoding.UTF8);
engine.Runtime.IO.SetOutput(ms, Encoding.UTF8);

Many online examples of this use calls that are deprecated in the IronPython 2.x betas

2. Referencing classes in mscorlib:

Be aware that doing the following:

import clr

is not going to be enough to get types out of mscorlib.  Although types will load from System, you'll need to get a reference to the assembly directly if you plan to use it in your hosted engine.  I had a little trouble with the StringBuilder but easily resolved it with the following after a tip on the IronPython mailing list.

Assembly assem = Assembly.GetAssembly(Type.GetType("System.Text.StringBuilder"));
engine.Runtime.LoadAssembly(assem); // make mscorlib's types visible to the engine
scope = engine.CreateScope();

3. The only novel thing I did that I didn't see a lot of was loading a module so that you could utilize it with your hosted engine.  I added to the project and set Visual Studio to copy it to the compile destination.  I then have the following code which keeps the module available for later use:

string p = Path.Combine(Environment.CurrentDirectory, "");
scope = engine.Runtime.ExecuteFile(p);

// later on:

ScriptSource source =
object res = source.Execute(scope);

All in all not rocket science, it's amazing how much power one has at their fingertips in such a small application.  I would love to see other modules, especially ones that define some interesting type of DSL, have utilities like this that let you play around without much effort.

Oh yeah, the project and source.  Download it here, I'll clean up a bit more later.


Sunday, April 20, 2008

Getting better with meta-thinking


Some excerpts from a great post from Ola Bini, one of the JRuby core developers:

In short, I believe that being able to abstract and understand what goes on in a programming language is one way to become more proficient in that language, but not only that - by changing your thinking to see this part of your environment you generally end up programming differently in all languages...

A little further on, emphasis is mine:

... I call this meta-level thinking. I think it's mostly a learned ability, but also that there is an aptitude component.

Cheers for this as a "learned ability" which would give one pretentious enough to call his blog "Metadeveloper" some hope.


Saturday, April 19, 2008

Generic Enum Parsing in C#


Generics, as it were, are passé to talk about these days. However, I found myself dealing with enumerations on Friday and was a little surprised there wasn't an approach to parsing them in a generic fashion. I scribbled the following; perhaps it will be of value to someone:

enum Test { v1, v2 }

public static T EnumParser<T>(string givenValue) {
    return (T)Enum.Parse(typeof(T), givenValue, true);
}

// usage:
Test val = EnumParser<Test>("v1");


Tuesday, April 15, 2008

Sense of urgency


One thing I’ve come to realize is that urgency is overrated. In fact, I’ve come to believe urgency is poisonous. Urgency may get things done a few days sooner, but what does it cost in morale? Few things burn morale like urgency. Urgency is acidic.
- read the whole post

Jason Fried of 37Signals neglects to mention one aspect that seems to come back over and over to haunt those of us who live with "urgency": that it costs dearly to rework things that were done in a hurry.


From TFS to SVN


My project at work added some remote developers and it was decided that rather than try to figure out TFS, which we use for everything internally, we'd use SVN. I'm not used to SVN since we used Team Foundation Server for source control and Visual SourceSafe before that - I've used TortoiseSVN quite a bit to get source code from projects online, but never as a primary source control environment for a big project.

So far it's a breeze. I installed VisualSVN Server and configured repositories and users - this took about 10 minutes.  After that I was up and running with TortoiseSVN in about 3 or 4 minutes.


Meanwhile, in a parallel universe, a completely different project I'm working on required me to install the TFS client for Visual Studio 2008.  On my super duper 4GB RAM, 3 GHz desktop machine at work it took about 45 minutes.

Just the client.

So yeah, I've been thinking about that contrast quite a bit all day.  One thing I think is interesting is the contrast between small teams as I heard discussed at CodeMash earlier this year and the features of TFS: all the note taking, task assignment, iteration,  etc, etc... On a "one to two pizza" team (e.g. a Google, Amazon), how many people go in depth with those features versus some massive team of developers (e.g. a government agency)?


Monday, April 14, 2008



Now has its own location. I'll still post goings-on here, but will keep the external site as the source of updates and documentation. There's a form there for feedback as well.


Tuesday, April 08, 2008

From Komodo to Wings


I started my IronPython musings with Notepad++ since it's my usual text editor (after being enamored with Textpad for many years, I moved on - lots of reasons, perhaps I'll blog them later).  As time passed and my scripts grew in size I started using an old version of Komodo I've got which has decent support for Python. I used Komodo (3.5) when I was writing a lot of Perl and it worked pretty well for that, especially since it had a good enough debugger and CGI emulator for me to dig myself out of holes.  Komodo does okay with Python but I've been using Wings IDE on recommendation from fuzzyman and it's much more responsive and fluid as a development environment.  Additionally, I got Komodo when there was still a cheapo version but now it's quite expensive - exceeding the Wife Acceptance Factor for experiments (WAFe).  Wings is quite reasonable, I'll start with the Personal edition and if I can get my company onboard I'll ask that they upgrade my license.

Unfortunately I still need to run my scripts outside the IDE environs since Wings isn't built with IronPython support (correct me if I'm wrong), but I'm quite used to this now. 


Monday, April 07, 2008

Programming Languages, Trends


DDJ (the print edition at least) has an interesting interview with Paul Jansen, managing director of TIOBE, on languages. Some interesting points:

  • Direct quote: "I expect in five years time there will be two main languages: Java and C#, followed closely by good old Visual Basic."
  • While not much has changed over the last 10 years, COBOL is now out of the top 10 list and Python is moving in.
  • Direct quote: "Another language that has had its day is Perl. It was once the standard language for every system administrator and build manager, but now everyone has been waiting on a new major release for more than seven years. That is considered too long."

Interestingly enough, TIOBE has picked the language of the year to be Python.


Sunday, April 06, 2008

Wednesday, April 02, 2008

Generate Catalog Images in IronPython


Suppose you have a client with a big product catalog that you're integrating with the web.  You work out a convention for how the data is displayed and want to generate some test images for yourself...

First, just some basics on generating a single image:

import clr
clr.AddReference("System.Drawing")

from System import *
from System.Drawing import *

def GenerateImage(text):
    # Bootstrap from a throwaway 1x1 bitmap, then draw onto a real 200x200 one
    starter = Bitmap(1, 1)
    f = Font("Arial", 14)
    theImage = Bitmap(starter, Size(200, 200))
    g = Graphics.FromImage(theImage)
    g.DrawString(text, f, SolidBrush(Color.DarkBlue), 0, 0)
    # Write the bitmap out so the result is visible
    theImage.Save("c:\\temp\\test.gif")

if __name__ == "__main__":
    GenerateImage("Holly was a hunter")
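If you want to experiment with the same idea outside of IronPython, here's a rough pure-Python stand-in (a sketch, not part of the original post): since System.Drawing isn't available under CPython and I'm not assuming PIL is installed, this writes a plain-text PPM placeholder using only the standard library. The 200x200 size and dark-blue color mirror the snippet above; the label goes into a PPM comment rather than being drawn as pixels.

```python
# A CPython stand-in for the System.Drawing snippet: emit a solid-color
# placeholder image in ASCII PPM format using only the standard library.
def generate_placeholder(text, path, width=200, height=200, rgb=(0, 0, 139)):
    with open(path, "w") as f:
        f.write("P3\n")                        # ASCII PPM magic number
        f.write("# %s\n" % text)               # carry the label as a comment
        f.write("%d %d\n255\n" % (width, height))
        pixel = "%d %d %d\n" % rgb             # (0, 0, 139) approximates DarkBlue
        for _ in range(width * height):
            f.write(pixel)

generate_placeholder("Holly was a hunter", "holly.ppm")
```

Any image viewer that understands Netpbm formats can open the result, which is enough for eyeballing a catalog layout.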

Here's some TSQL for the sample data I'll use:

create table testdata(
    testid int identity(1,1),
    testvalue varchar(50)
)

declare @i int
set @i = 0
while @i < 25 begin
    insert into testdata(testvalue)
    values('PRODUCT ' + convert(varchar(2), @i))
    set @i = @i + 1
end
Now for a finalized version that hits the database and generates images:

import clr
clr.AddReference("System.Drawing")
clr.AddReference("System.Data")

from System import *
from System.Drawing import *
from System.Data import *
from System.Data.SqlClient import *

def GenerateImage(text, path):
    starter = Bitmap(1, 1)
    f = Font("Arial", 14)
    theImage = Bitmap(starter, Size(200, 200))
    g = Graphics.FromImage(theImage)
    g.DrawString(text, f, SolidBrush(Color.DarkBlue), 0, 0)
    # Write the image out to the requested path
    theImage.Save(path)

def GenerateImagesFromDatabase(connectString, query, sourceColumn, outPath):
    cn = SqlConnection(connectString)
    cmd = SqlCommand(query, cn)
    # The connection must be open before ExecuteReader;
    # CloseConnection ensures it's closed when the reader is done
    cn.Open()
    r = cmd.ExecuteReader(CommandBehavior.CloseConnection)
    while r.Read():
        baseText = r[sourceColumn].ToString()
        GenerateImage(baseText, outPath + "\\" + baseText + ".gif")

if __name__ == "__main__":
    connect = "Data Source=.\\sqlexpress;Initial Catalog=MVCBaby;Integrated Security=SSPI;"
    sql = "select * from testdata"
    col = "testvalue"
    GenerateImagesFromDatabase(connect, sql, col, "c:\\temp")
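One wrinkle in that loop: the product name goes straight into the file path, so a testvalue containing characters like \, /, or : would break the save. Here's a small hedged sketch of a sanitizer in plain Python (the character set and helper names are my assumptions, not from the original script):

```python
import re

def safe_filename(text, replacement="_"):
    # Replace characters Windows forbids in file names: \ / : * ? " < > |
    return re.sub(r'[\\/:*?"<>|]', replacement, text).strip()

def image_path(out_path, base_text):
    # Build the output path the same way the IronPython loop does
    return out_path + "\\" + safe_filename(base_text) + ".gif"
```

Plain names like "PRODUCT 1" pass through untouched, so it doesn't change behavior for the sample data above.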


Pretty cool stuff, I hope it's useful to more people than just myself.  Download the first script here, the second more elaborate one here.