Tuesday, July 31, 2007

Fast Learning, Norvigian Thinking

{

I was looking over Justice Gray's post on becoming a better developer in 6 months and noticed that he had listed Friedl's Mastering Regular Expressions as a one week reading project. I'll preface my comments by admitting I am not the quickest reader - unless I'm reading something I don't particularly care about, I tend to go slow no matter what the subject matter.

However.

Not really thinking of connecting the dots, I checked in on Jeffrey Friedl's blog and, digging around, saw that it took him 2.5 years to complete the first edition of the book.

It strikes me as funny that there would be such a disparity between creation and consumption. Of course I may be slow - a big part of my wanting to learn Perl was the desire to have the capacity to "think" in regular expressions - but as I've gotten better and better over the last few years, I find it hard to believe I could have produced the little projects and tools I've written without the pain and grit of slowly using what I read.

But then again we can all rest assured that there's no rush.

}

Sunday, July 29, 2007

Remove Duplicate Lines In Python

{

I had posted about the set operations in Python with some questions. All that changed today when I wrote a little script to remove duplicate lines from a file. The built-in set() takes a list (or any other iterable) and automatically gets rid of duplicate items, although it does not keep the items in their original order. Very useful for situations like this:


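#!/usr/bin/env python

# Read every line, let set() drop the duplicates, then write the result.
# Note: a set is unordered, so the output lines may come out reordered.
f = open("c:\\temp\\Original.txt")
uniquelines = set(f.read().splitlines())
f.close()

f2 = open("c:\\temp\\Unique.txt", "w")
f2.write("".join([line + "\n" for line in uniquelines]))
f2.close()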
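
If the order of the lines matters, a set on its own is not enough. Here is a minimal sketch of one way to keep the first occurrence of each line, assuming the whole file fits in memory. The names unique_lines and dedupe_file are made up for illustration and are not part of the script above.

```python
def unique_lines(lines):
    """Return lines in first-seen order, dropping later duplicates."""
    seen = set()      # fast membership test for lines already emitted
    result = []
    for line in lines:
        if line not in seen:
            seen.add(line)
            result.append(line)
    return result


def dedupe_file(src_path, dst_path):
    """Copy src_path to dst_path, keeping only the first copy of each line."""
    with open(src_path) as src:
        lines = src.read().splitlines()
    with open(dst_path, "w") as dst:
        dst.writelines(line + "\n" for line in unique_lines(lines))
```

The set is still doing the work of spotting duplicates; the list only remembers the order in which lines were first seen.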


}

Saturday, July 21, 2007

Shuffled Arrays

{

One of my weaknesses is that I love puzzles. And once I'm puzzle solving, I usually dwell on the problem beyond its worth. I recently saw a job ad - I'll leave this post disconnected - that had a quiz associated with it. The quiz amounts, basically, to shuffling items in an array (JavaScript).

My first stab was intuitive, but I doubt it's optimal because it relies on a lot of discarded data. In the loop that generates the random order, I go around an indeterminate number of times, discarding any result that already exists in the randomized array. Additionally, the array of random numbers is just for positioning and is probably unnecessary.
Here is the code:


function BuildArray() {
    // just builds a small test array to work with
    var testArray = new Array();
    testArray.push('Test 0');
    testArray.push('Test 1');
    testArray.push('Test 2');
    testArray.push('Test 3');
    return testArray;
}

function GetShiftOrder(testObject) {
    // picks random indexes, skipping any already used, to build the new order
    var upperBound = testObject.length;
    var newOrder = new Array();
    var shuffledArray = new Array();
    builder:
    while (newOrder.length < upperBound) {
        var n = Math.floor(Math.random() * upperBound);
        for (var i = 0; i < newOrder.length; i++) {
            if (newOrder[i] == n) {
                continue builder; // index already used, so draw again
            }
        }
        newOrder.push(n);
        shuffledArray.push(testObject[n]);
    }
    alert(newOrder + '\n' + shuffledArray);
}


I was thinking about this today, and it may be more efficient to make a random number of passes over the array, swapping pairs of items. A card-shuffling approach may be slightly more efficient, although to have meaningful swaps there would need to be a minimum number of passes - plus the additional complexity of making sure pairs were swapped in a random pattern.

Anyway, it was interesting and I'm always curious about more elegant solutions.
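
One well-known answer to the pair-swapping idea is the Fisher-Yates shuffle, which needs exactly one pass and throws nothing away. Below is a minimal sketch, written in Python rather than JavaScript and not taken from the quiz. The function name shuffled and its rng parameter are made up for illustration.

```python
import random


def shuffled(items, rng=random):
    """Return a shuffled copy of items using the Fisher-Yates shuffle."""
    result = list(items)  # work on a copy so the caller's list is untouched
    # Walk from the last slot down to the second, swapping each slot with a
    # randomly chosen slot at or before it (randint includes both endpoints).
    for i in range(len(result) - 1, 0, -1):
        j = rng.randint(0, i)
        result[i], result[j] = result[j], result[i]
    return result


if __name__ == "__main__":
    print(shuffled(["Test 0", "Test 1", "Test 2", "Test 3"]))
```

Each slot is swapped once, so the work grows linearly with the length of the array, and no draw is ever discarded. Python's standard library also ships random.shuffle, which shuffles a list in place.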

}

Thursday, July 19, 2007

CodePress

{

I'm taking a serious look at CodePress. Very, very, very cool.

}

Sunday, July 15, 2007

Sieve of Eratosthenes, Python, Set Operations

{

I pretty much broke down after TechEd. I thought I'd be patient enough to wait for Ruby, but because Python is the most mature I have started to learn it, whitespace or no.

The first thing I did was check out Guido van Rossum's tutorial for programmers, which was an excellent first step. I've followed that up with some random programming - a lot of fun so far.

I was wondering about the set operations in Python and how they make a difference in programming, since there aren't syntactical equivalents in .NET. Incidentally, at the time I was reading The Man Who Loved Only Numbers and ran across a short description of the Sieve of Eratosthenes as a way of finding primes. I thought it would be a good way to check out the set operations of Python.

I wrote the following:


from System import *

def multiples(num, thresh):
    # all multiples of num (2*num, 3*num, ...) below thresh
    multi = []
    for i in range(2,thresh):
        m = num * i
        if(m < thresh):
            multi.append(m)
        else:
            break
    return multi

primeThresh = 5000
print DateTime.Now.ToString("MM/dd/yyyy hh:mm:ss")
nums = range(2,primeThresh)

# the loop walks the original list; nums is rebound to a new, smaller set each pass
for n in nums:
    nums = set(nums) - set(multiples(n, primeThresh))

print nums

print DateTime.Now.ToString("MM/dd/yyyy hh:mm:ss")


Interesting, but slow. The way I might do something like that in C# (which also works just fine in Python) would be this, which I wrote later:


from System import *

primeThresh = 5000
print DateTime.Now.ToString("MM/dd/yyyy hh:mm:ss")
nums = range(2,primeThresh)

for n in nums:
    for m in range(2,primeThresh):
        try:
            mult = n * m
            if(mult > primeThresh):
                break
            nums.remove(mult)
        except ValueError:
            # mult was already removed as a multiple of a smaller number
            pass

print nums

print DateTime.Now.ToString("MM/dd/yyyy hh:mm:ss")


It's a lot faster than the previous approach, which makes sense - doing set algebra on large sets should take a long time... but that raises the question: are those set operations dangerous (i.e., so slow as to be costly)?

I'm wondering what's "pythonic" and how a jedi would write this most efficiently.
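
One common way to write the sieve is with a list of flags instead of sets or list removals. This is a minimal sketch in plain Python with no .NET imports, offered as one option rather than the definitive "pythonic" answer. The function name primes_below is made up for illustration.

```python
def primes_below(limit):
    """Return all primes smaller than limit using the Sieve of Eratosthenes."""
    if limit < 2:
        return []
    # is_prime[i] stays True until i is crossed off as a multiple of a prime.
    is_prime = [True] * limit
    is_prime[0] = is_prime[1] = False
    n = 2
    while n * n < limit:
        if is_prime[n]:
            # Smaller multiples of n were already crossed off by smaller
            # primes, so crossing off can start at n * n.
            for multiple in range(n * n, limit, n):
                is_prime[multiple] = False
        n += 1
    return [i for i, flag in enumerate(is_prime) if flag]


if __name__ == "__main__":
    print(primes_below(50))
```

Crossing off a flag is a constant-time index assignment, whereas removing a value from a list has to search for it first and rebuilding a set copies every element, which is most of the cost in the two versions above.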

}

Saturday, July 14, 2007

Being Better

{

I probably won't be officially "tagged" but a meme is going around about what one would do in the next 6 months to become better. Hanselman, in his podcast, spoke of a few things and the responses seem to be going up around the blogs.

How will I be better? Most of the things people mentioned are things I already try to do: reading technical books, working on my software, looking at open source, training others... but one thing I want to do is to start posting code.

I write a lot of code and much of it doesn't make it here because I'm sheepish about looking foolish. But getting better is about having courage to show my work because without that, how could it get any better?

I'm going to start with my JAPH, a program that prints "Just Another Perl Hacker." It's meant to be novel, and I actually like mine since it leverages some language features that don't exist in my mainstay, C#. The map operation can visit each item in a list, performing some operation - in this case I'm using a regular expression on each item and grabbing alternating characters! I haven't been writing Perl for a long time, so hopefully over time I can claim a better one, but at least this is my own thinking at play, destined for improvement. I did post it on Usenet (a first version which had some foolish mistakes), but this is after a little bit of massage:

#!/usr/bin/perl -w
# David's JAPH, volume 0.1
map(print(/(\w)\w/g," "),
qw(Jaubsctd Aenfogthhiejrk Plemrmln Hoapcqkresrt));


The original JAPHs, from Randal Schwartz, were not really about obfuscation but more about language features. The use of map, regular expressions, and $_ in the above are what make it interesting to me.
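
For anyone who doesn't read Perl, here is a rough Python sketch of the same trick: map over the words and use a regular expression to keep the first character of every pair. It is an illustration, not a translation of the Perl idioms, and the helper name every_other_char is made up.

```python
import re

words = "Jaubsctd Aenfogthhiejrk Plemrmln Hoapcqkresrt".split()


def every_other_char(word):
    """Keep the first character of each two-character pair in word."""
    # With a single capture group, findall returns just the captured text.
    return "".join(re.findall(r"(\w)\w", word))


message = " ".join(map(every_other_char, words))
print(message)  # prints "Just Another Perl Hacker"
```

The Perl version gets away without a named helper because the regular expression matches against $_ implicitly inside map.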

}

JLam on DNR

{

I never mind a long drive with a podcast. A few days ago I listened to John Lam on Dot Net Rocks. I'll admit that sometimes the DNR people can be annoying to me (think Richard Campbell saying over and over like it's a joke "Managed JavaScript??") but it was all worth it to hear some of the internal goings on with the Iron Ruby project.

Gleaning: Iron Ruby is a ways off. I got somewhat impatient and have been using IronPython in the meantime, but more on that later. That's not a small decision for me because I am of the Norvig perspective on learning a new language - it takes the length of time needed to think in terms of the idioms of that language, as opposed to just writing your code with different syntax and keywords (à la writing C# in Ruby).

Gleaning: Ruby is a powerful language for creating domain specific languages. That meme has been floating in my head for a while - listened to a good Software Engineering Radio podcast on the topic sometime back.

Gleaning: JLam talks a bit about how the learning curve for new languages these days revolves around the frameworks for languages - I think the idea of Ruby combined with .NET is that the .NET libraries will fill a gap that Ruby has always missed in library support. Dangerous or brilliant? In other words, I'm sure the open source crowd (is/will be) up in arms because this will, in their minds, dilute the community effort behind a language. Brilliant because the Mort you know (who is probably your boss) will have a better level of confidence about using it.

Gleaning: Iron Ruby is implemented in C# but Managed JavaScript for Silverlight is a VB.NET implementation!

}

Monday, July 02, 2007

True Story: The Cat Ate My Source

{

A security issue prompted my ISP to change passwords for all users. Normally, this wouldn't be a problem - I'd go to my repository of code, divided at present by year in a "Code05," "Code06," "Code07" format and update a constant in my Constants class and *presto* the connectivity would be back.

Problem is, what if I have to go back more than 3 years?

I'm usually not that irresponsible. In fact, I'm fiendish about backing things up. Yes, I use source control, even on my own little-bitty projects that matter to no one but me. But my mistake was using an external drive - a flaky one - which died after I'd had to reformat/reinstall. I'd backed everything up to it, and my data was gone.

Edge case, since after the reinstall I'd moved what was important back to the machine. Edge case because it was in a "code archive" that was something like 6 years old - code that was so long forgotten it was like a head bludgeon when I got a phone call from the client saying that their site was down.

After panicking and looking for some accidental backup (why do we look for things when we know they're gone for good?) I had an epiphany.

I downloaded the *.aspx pages and the site dll - I then used Reflector to disassemble and - I kid you not - 15 minutes later the client was up and running.

And yes, I kept the source...

}