Invariant Properties


First thoughts with TDD

Bear Giles | March 30, 2011

I’ve been experimenting with test driven development (TDD) for the last few evenings on a small project generating OEIS sequences for a Project Euler problem. These are extremely early impressions!

Lesson 1: Take the time to write all tests first. This was hard. You can’t just write the junit tests – you have to implement enough of the ‘real’ code for those tests to compile, run and fail. It will only take a few minutes to actually implement the function… you think. Don’t do it!
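As a sketch of what "just enough real code" looks like (the class, method, and sequence here are hypothetical, not from the actual project): a stub like this lets the JUnit tests compile, run, and fail without tempting you into implementing the function early.

```java
// Hypothetical stub – just enough 'real' code for the unit tests to compile
// and fail. The names and sequence are illustrative, not from the project.
class TriangularSequence {
    /** Returns the n-th triangular number (OEIS A000217). */
    long get(int n) {
        // Deliberately unimplemented: every test against this method fails,
        // which is exactly the starting state TDD wants.
        throw new UnsupportedOperationException("not implemented yet");
    }
}
```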

Lesson 2: You still need ‘white box’ tests. My unit tests verified behavior, and almost all were passing – just a few stragglers left that could be quickly knocked out. Or so it seemed.

Nope. It turned out that several of the tests should have failed but didn’t, because my objects had an internal cache that had been populated by earlier unit tests. This is counterintuitive since the interface is very much immutable, but memoization is required for decent performance. That memoization meant I wasn’t testing the code I thought I was testing, and that gave me some false passes. (Note to self: add code coverage tests earlier when using TDD!)

It’s easy to solve this problem but it requires knowing that you need to reset the object or create a new one before each test. This is good practice anyway but can be a bit of a pain when you’re testing common functionality at the abstract base class level. In addition you have to know how large any pregenerated caches are. These are implementation details that you won’t know when writing the initial TDD unit tests.
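A minimal sketch of the hazard (the class is hypothetical, and a Fibonacci-style recurrence stands in for the actual OEIS sequence): the interface looks immutable, but the internal cache means a later test against a shared instance may never execute the code it appears to exercise. Creating a fresh instance in a JUnit @Before (or @BeforeEach) method avoids the false passes.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the hazard: an outwardly immutable sequence that memoizes
// internally. Names are illustrative, not from the original project.
class MemoizedSequence {
    private final Map<Integer, Long> cache = new HashMap<>();

    public long get(int n) {
        Long v = cache.get(n);
        if (v == null) {
            // If an earlier test already populated the cache, this branch –
            // the code actually under test – is silently skipped.
            v = compute(n);
            cache.put(n, v);
        }
        return v;
    }

    private long compute(int n) {
        return (n <= 1) ? n : get(n - 1) + get(n - 2);
    }

    // Fix: in a JUnit @Before method, create a new MemoizedSequence so each
    // test starts with an empty cache rather than one warmed by earlier tests.
}
```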

Lesson 3: It’s nice to know when you’re done. It’s basic psychology – it’s a lot more satisfying to tick off the failed tests and be done than to code a bit, write unit tests for that code, then code a bit more, then write some more unit tests, ad nauseam.

Categories: java

Hybrid Disks Matter

Bear Giles | March 24, 2011

I had a chance to actually use my new system last night.

Holy shit!

It’s impossible to say one system is X times faster than another since it depends so much on what you’re doing. Web browsing? Playing games? Developing and testing software? Formatting documents (with LaTeX or Docbook)? But you can look at snippets.

  • How long does it take to load a typical web page? (Well under a second.)
  • How long did a sample unit test take? (1 second on new system, 6 seconds on old development system.)
  • How long did it take to load MyEclipse? (perhaps 5 seconds on new system, over 30 seconds on old development system.)

Overall I think it’s fair to say that it’s at least 5 times faster than my old development system for the things I do the most, and perhaps 10 times faster than my old ‘lightweight browser’ system that I often did some light development on. Not enough for me to take on whole new types of tasks, but enough to make a very noticeable difference and to help me stay focused, since there’s no longer just enough downtime to get distracted.

The key is the hardware. I took Joel Spolsky’s advice and looked at four main criteria for my selection:

  • It has server-grade hardware (quad-core Xeon processors).
  • It has a decent amount of memory (8 GB).
  • It has a fast disk (hybrid, with a 4 GB SSD cache backing a 500 GB conventional hard disk). I think this makes a huge difference.
  • It supports hardware virtualization (VT-x).

Joel also requires dual monitors, but I wasn’t as concerned about that since I don’t know where I would put the second monitor. The hardware supports it, though.

All of this for under $1000 since I bought a refurbished server and put in a new disk.

I have one final thought. Like most developers I tend to run my own database instance. That takes up memory and CPU though, especially once I use a virtualized server instead of running it inside my desktop. Would I benefit from moving my servers to another system? With the right hardware I could even use the cloud architecture I mentioned earlier – it would be easy to bring up precisely what I want. I haven’t decided yet but it’s an attractive proposal.

Categories: Uncategorized

System names and incomplete thoughts

Bear Giles | March 23, 2011

Anyone who has multiple computers knows that it can become challenging to come up with new system names. I’ve reused several names for over a decade but in those cases it was actually the hard disk that was named!

(The way this worked was that at some point I did a clean install on new hardware but afterwards I either moved the old hard disk into a new barebones system (for faster CPU and more memory) or put a new hard disk into existing hardware. Either way there was clear continuity.)

Today? I just got a new (refurbished) system from Pacific Geek – a sweet little quad-core Xeon server with 8 GB of memory plus a hybrid SSD/hard disk from Newegg, all for under $1000. It will become my primary linux development system but my old system will still be around so I can’t reuse the name.

So what do I name the new system?

I’ve been using Shakespearean names for my last few ‘additional’ systems. Hey, it’s better than naming systems after the cast of Jersey Shore!

For some reason A Midsummer Night’s Dream seems like a good idea.

I was tempted by Oberon but decided against it. I’m not sure why – maybe because it’s so close to the computer’s model (Optiplex 960), and I don’t want to tie names to hardware models since it’s a real pain if you move the disk to a new system. I eventually went with one of the fairies – Mustardseed.

This morning I realized that I had really screwed up. My subconscious wanted the fairies since one of the primary requirements for this system is that it support virtualization. Guest OS = fairy. Host OS = king of the fairies.

On the other hand my conscious mind knows that the guest OS will be single-purpose. E.g., it will run one version of a database. Not just postgresql, for instance, but postgresql-8.4. If I want to install postgresql-9.0 I’ll create a new guest OS.

From this perspective a full-fledged system name seems excessive. I can name it by the service it provides, e.g., ‘postgresql84’ or ‘tomcat70’.

There’s still the question of what to name the guest OSes. Do I stay generic and have problems if I want to have multiple instances? Do I include the host OS name and have problems if I migrate the instance to other hardware? Hmm… each has pros and cons.

Complicating this is the secondary goal of experimenting with clouds. What do I name the guest OS if it’s born locally but migrated to Amazon? What if I create multiple instances in the cloud, e.g., I play with a single-node factorization engine but want to see what it can do if I create a small cluster on Amazon?

Categories: linux

Private Clouds For Development Environments?

Bear Giles | March 6, 2011

Here’s something I’ve been thinking about for the last year or so. No answers, just questions.

What does the typical small development team’s resources look like? I’m thinking of anything from a single person working on side projects at home to teams with a dozen or more members in an office.

There are typically a few overworked servers grudgingly provided by the IT group. The physical servers host numerous services:

  • continuous integration build server. We require one build server per supported OS flavor if we distribute the software – this can add up quickly. (Windows XP, Vista or 7? Redhat, Ubuntu or CentOS? Mac? 32-bit or 64-bit? What about smart phones?)
  • database server (for CI results)
  • web server for CI build
  • web server for daily build (testing)
  • web server for regression tests (where appropriate)
  • database server(s) (for web servers)
  • subversion server
  • bug/issue tracking
  • database server (for bug/issue tracking)
  • local maven repository (if you use maven)

Some of these servers are resource-intensive, others aren’t. E.g., subversion just doesn’t require that many CPU cycles or disk space. The maven repository might have a database backend but otherwise it’s also a low load.

The first approach everyone takes is to merge services. One database server shared by all applications. One appserver hosting multiple applications. That drops the requirements substantially, right?

In practice the services start bumping into each other. You can lose all services while doing maintenance on one. Occasionally a critical fix on one service will require a database bump that breaks another service.

The second approach is to migrate the less resource-intensive servers onto virtual machines. This gives good separation but comes with its own costs. Fortunately most of these costs are rapidly dropping (better processors, cheaper memory, better software) but you still have to establish requirements, set up the virtual servers, etc.

The third approach that I’ve recommended but never seen put into practice is to move some of these tasks into the cloud. Do we really need to have build servers on local hardware? This is a cheap way to quickly expand capacity – we don’t have to fight for capital budget since the cost of a cloud server isn’t that high if we only run it during working hours.

There’s a secondary benefit of standardizing our virtual servers. We don’t have to think about how much space to allocate, we don’t have to load the operating system ourselves, we just select one of a dozen standard configurations, grab an image, and we’re running in minutes.

This approach runs into a management roadblock that reminds me of the earliest days of the web. This is new and the risks are unknown from non-technical management’s perspective. We see large companies with thousands of clients, they see “TO THE CLOUD!” ads by Microsoft. (I still have no clue what they mean by that.)

This brings us to my current thoughts. Running virtual servers isn’t the only thing that’s gotten easier in the last few years – you can make your own private cloud with open.eucalyptus.com and other tools. You can also create hybrid clouds.

(Some quick definitions – a ‘public’ cloud is hosted by a company that offers services to the public – Amazon, Rackspace, etc. A ‘private’ cloud resides on a company’s own hardware. A ‘hybrid’ cloud is hosted on both types of hardware.)

This means it’s now easy to manage your development servers. It’s a few clicks to move a server from one node to another node if the first node gets too busy. It’s a few more clicks to move services to a new server if you get new hardware.

You can expand this idea – make every developer’s workstation a node. It has no available slots during the workday but can be used in the evening and weekend for additional tasks. E.g., perhaps use them for load testing from midnight until 6 AM. This would be time-consuming if you do it manually but with a cloud it’s a few quick cron entries.
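As a sketch of those cron entries (the euca2ools command names are assumptions based on the Eucalyptus tooling mentioned above, and the image id and instance count are hypothetical): bring up instances on the idle workstation nodes at midnight and tear them down before the workday starts.

```shell
# Hypothetical crontab on the cloud controller. Command names, the image id
# (emi-loadtest) and the instance count are illustrative assumptions.
# 00:00 – launch four load-testing instances on idle workstation nodes.
0 0 * * * euca-run-instances -n 4 emi-loadtest >> /var/log/nightly-loadtest.log 2>&1
# 06:00 – terminate whatever instances are running that image.
0 6 * * * euca-terminate-instances $(euca-describe-instances | awk '/emi-loadtest/ {print $2}')
```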

Does this mean the servers will no longer be overworked?  Of course not – you’ve moved everything into virtual servers and added another service. Does that mean this is a pointless exercise? I don’t think so – I think it will be easier to manage standard virtual instances in a cloud than to manage the mishmash that usually develops. It will definitely be easier to bring up and shutdown servers as necessary, e.g., to bring up a legacy server for regression tests.

But as I said, I only have questions, not answers. I think it’s worth pursuing but don’t have hard numbers to prove how good or bad it is.

Categories: java, linux
