random complexity

with a side of crazy

Project week one wrap up

So recently I took a whole week off to work on personal projects. I started by creating a list of things I wanted to achieve in that week - mainly things I'd not quite got to in my normal downtime - and then expanded from there. Despite trying not to let this list grow too big, for fear of building an insurmountable task list, I managed to keep it to two main projects and a handful of smaller separate tasks.

The two main projects were deliberately unrelated, the idea being that I could spend alternate days on each, or swap when I got stuck or simply needed a change.

Unfortunately I never quite got started on those two projects, but I did make major progress on nearly everything else: rebuilding my file server, regular exercise, rebuilding my desktop PC, catching up on recent TV and a laundry list of random odd jobs around the house.

The file server rebuild deserves a whole post so I'll keep that separate.

The desktop PC rebuild was simply overdue. My previous main desktop died (northbridge failure) about two years ago. Since then I'd used an ASRock Atom 330 as a desktop, which later became my media PC, and then an AMD-based desktop (which began life as a Bitcoin mining box). The AMD machine had to run Windows because the Linux ATI drivers were an epic pile of crap.

So to cut to the chase, I replaced the whole machine.

I think the thing I was most impressed with was the power draw: at idle the machine uses 40W, and at 100% CPU and 100% GPU load (CUDA) it pulls 220W. Oh, and I'm back on Linux again too, which is nice.

The graphics card barely fit in too, which was funny.

Gigabyte GTX670 OC

That'll do for now. The file server rebuild post is going to be huge!

Migrating to Git

I've been meaning to move from Subversion to Git for a while. Most people agree that Git is superior, and now there are few if any Windows-related compatibility issues. Windows compatibility is a must-have for me because a large part of the code I write is developed on and targeted for Windows.

Historically I've used SVN with TortoiseSVN as the interface on Windows, and the plain old svn command line on Linux. My single repository contained all of my projects, split out at the root by language: csharp, cpp, python, web and most recently solaris. This structure served me well, as it allowed me to keep shared libraries outside any one project directory, in a common location. I also had some third-party shared library binaries in here - log4net, the MySQL connector, NHibernate - plus any binary-only files projects required, such as icons and images. Apart from these few exceptions I was not keeping binaries in the repository at all; I'd learned that from my CVS days.

I was running SVN on a remote VPS and working directly against it. This gave me a common way to work on my own code from various locations. It was secured with username/password authentication, ran over SSL, and was backed up regularly to my main file server. So I had an off-site backup for the master repository, which was itself off site from wherever I was working.

When looking at GitHub it became clear that I needed to have all of my projects in separate repositories. This makes sense in the Git world because it appears you can't do a partial checkout. Under SVN it's possible to check out a subdirectory of the repository and work on it as if it's self-contained; I used this when working with the solaris code so I didn't need the whole repository checked out on that VM. It works fine. Under Git, however, it won't work.

Another obvious reason for reorganising my repository was the top-level language directories. They made sense when most of my code was C++ or C#, as projects in one language were unlikely to use those in another. I later added web as a catch-all for what remained of my web projects (primarily PHP4-based), and much later solaris was added to contain my NAS project (which is written in bash and Python) - kept separate to allow a partial checkout. As time went on I had other projects which broke this clean separation: my static blog code was written in C# but was ultimately a web project. Clearly the structure had outgrown its purpose.

The end goal of the migration was to bring my entire version history across into multiple per-project repositories. That way I could selectively publish a repository to the public if/when I wanted to. So the first step was to migrate into Git, and then split out each project into its own repository, with full history.

While looking for migration tools, I found many projects (most on GitHub) named svn2git. Some were forks of one another and others were totally different. Several I tried didn't even compile, so they were quickly discarded. I ended up settling on a (sigh) Ruby-based one: svn2git.

After some initial failed attempts to get it to run against my VPS directly (private SSL cert issues and user auth), I spun up a CentOS 6 VM at home, copied my SVN repo onto it and configured mod_dav_svn there. From here I was able to work in isolation and fix up any issues before doing the real migration.

Ultimately the readme was correct: from a minimal CentOS 6.2 installation with RPMForge configured, I ran approximately the following (it's highly likely I did extra things not listed here).

    yum install git git-svn subversion mod_dav_svn ruby rubygems httpd elinks

I copied my repository in from a recent backup to /data/svnrepo, then configured mod_dav_svn by adding the following to /etc/httpd/conf.d/subversion.conf:

    <Location /svnrepo>
       DAV svn
       SVNPath /data/svnrepo
    </Location>

I started Apache and browsed to the location to verify it was working:

    service httpd start

Don't forget to set up SSH keys for authentication with your remote site, so Git won't prompt for a password when connecting.

    ssh-copy-id -i ~/.ssh/id_rsa.pub remotehost
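If a key pair doesn't already exist on the box, one can be generated first. A quick sketch (the empty passphrase here is purely a convenience for unattended pushes; use a real passphrase if that worries you):

```shell
# Create ~/.ssh if missing, then generate a 4096-bit RSA key pair.
# -N "" sets an empty passphrase so scripted pushes won't prompt.
mkdir -p "$HOME/.ssh"
ssh-keygen -t rsa -b 4096 -N "" -f "$HOME/.ssh/id_rsa"
```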

At this point I was confident the SVN side was working and I could log into the remote site without a password, so I could proceed with the Git side:

    gem install svn2git

I created the authors.txt file as suggested in the guide, in its default location ~/.svn2git/authors

    robert = robert <email@here>
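In my case there was only one author to map, but with more committers the file can be seeded from the SVN log rather than typed by hand. A sketch, assuming the output of `svn log -q` has been saved to a file first; the sample log lines and the example.com addresses are placeholders to fix up afterwards:

```shell
# Normally: svn log -q http://localhost/svnrepo > svnlog.txt
# Sample "svn log -q" output, for illustration only:
cat > svnlog.txt <<'EOF'
------------------------------------------------------------------------
r42 | robert | 2012-05-01 10:00:00 +1000 (Tue, 01 May 2012)
------------------------------------------------------------------------
r41 | robert | 2012-04-30 09:00:00 +1000 (Mon, 30 Apr 2012)
------------------------------------------------------------------------
EOF

# Pull out the username column, dedupe, and emit svn2git author lines.
awk -F'|' '/^r[0-9]+ /{ gsub(/^ +| +$/, "", $2); print $2 }' svnlog.txt |
    sort -u |
    sed 's/.*/& = & <&@example.com>/' > authors.txt
cat authors.txt
```

For the sample above this produces `robert = robert <robert@example.com>`, one line per unique committer.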

Then, after some trial and error to get the settings just right:

    mkdir ~/stagingrepo
    cd ~/stagingrepo
    svn2git http://localhost/svnrepo --rootistrunk -v

This ran through and imported all of my revisions into a new Git repository. From here I needed to split it out into new per-project repositories.

    mkdir ~/gits

To do this I cloned the staging repo, then used the filter-branch command to throw out everything except a chosen subdirectory - in this case a specific project. This causes issues for projects that reference items outside their own directory, so I have to be aware of that later.

    mkdir ~/gits/project-x
    cd ~/gits/project-x
    git clone ~/stagingrepo .
    git filter-branch --subdirectory-filter csharp/project-x HEAD

And now the new remote server repository:

    remote$ mkdir /data/gits/project-x
    cd /data/gits/project-x
    git init --bare

    local$ git remote rm origin
    git remote add origin ssh://remotehost/data/gits/project-x
    git push --all

Now from my normal workstation I can simply clone the repo and continue working, pushing to the remote site when ready.

    git clone ssh://remotehost/data/gits/project-x

The temporary repositories - the staging repo and the per-project ones created to push up to the remote site - can be deleted as they're no longer needed. Note that after the filter-branch step, the single-project Git repository is still as large as the original clone. I incorrectly thought running git gc on it would prune the no-longer-referenced files, but this is not the case: filter-branch keeps backup refs under refs/original/ and the reflog still references the old objects. However, the data pushed up to the remote bare repository only contains the files and history that are actually referenced. This is important to me, because when I open source a project I don't want my other projects leaking out.
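For completeness, the local filtered repository can be shrunk too - it just takes more than a plain git gc. A sketch of what appears to work, run here on a throwaway demo repo (the directory names and file sizes are made up for illustration):

```shell
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1

# Build a throwaway repo: one project subdirectory plus unrelated data.
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email you@example.com
git config user.name you
mkdir -p csharp/project-x
echo 'hello' > csharp/project-x/a.txt
head -c 1048576 /dev/urandom > big.bin    # 1MiB of unrelated data
git add . && git commit -qm 'initial import'

# Keep only the project subdirectory, as in the migration above.
git filter-branch --subdirectory-filter csharp/project-x HEAD

# filter-branch keeps backup refs under refs/original/ and the reflog
# still points at the old commits; drop both, then repack so the
# unreferenced objects (big.bin) finally get pruned.
git for-each-ref --format='%(refname)' refs/original/ |
    while read ref; do git update-ref -d "$ref"; done
git reflog expire --expire=now --all
git gc --prune=now --aggressive

git count-objects -vH    # pack size drops from ~1MiB to a few KiB
```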

Once I'd figured out what I needed to do to achieve this, I was able to script up my list of projects to automate the migration.

Remote site script (to run in the directory hosting the Git repositories, /data/gits/):

    GITS="csharp/project-a csharp/project-b solaris/project-c"

    for repopath in ${GITS}; do
        repo=$(basename ${repopath})
        mkdir -p ${repo}.git
        cd ${repo}.git
        git init --bare
        cd ..
    done

Local script for the migration (to run in the directory hosting the temporary Git repositories):

    # Locations from earlier in the post: the local SVN URL and the
    # remote base path for the bare Git repositories.
    SVNBASE=http://localhost/svnrepo
    GITBASE=ssh://remotehost/data/gits
    GITS="csharp/project-a csharp/project-b solaris/project-c"

    echo Performing first migration
    mkdir stagingrepo
    cd stagingrepo
    svn2git ${SVNBASE} --rootistrunk
    cd ..

    echo Now each project
    for repopath in ${GITS}; do
        repo=$(basename ${repopath})
        echo "- ${repo}"
        mkdir ${repo}
        cd ${repo}

        git clone ../stagingrepo .
        git filter-branch --subdirectory-filter ${repopath} HEAD
        git remote rm origin
        git remote add origin ${GITBASE}/${repo}.git
        git push --all
        cd ..
        #rm -rf ${repo}
    done
    #rm -rf stagingrepo

Then finally the script to clone these back down to the workstation:

    GITBASE=ssh://remotehost/data/gits
    GITS="csharp/project-a csharp/project-b solaris/project-c"

    for repopath in ${GITS}; do
        repo=$(basename ${repopath})
        git clone ${GITBASE}/${repo}.git ${repo}
    done

So there you have it. That's how I migrated from a combined SVN repository into 28 new Git repositories. As an added bonus, Git for Windows comes with a bash shell, so I was able to use that last script to clone everything down onto my Windows PC too.

Ongoing Rationalisation and the Cloud

I find myself yet again at a fork in the road. After over a year of considering the ideals of minimalism and figuring out how to adapt them to my life, I find myself contemplating the epic "less is more" paradigm with some gusto.

The first starting point (I start often; finishing is all relative to the destination) was to rationalise the projects list. Over the years my projects list has moved through various incarnations:

  • paper lists (Post-its, loose pages, scrap paper, spiral-bound notebooks)
  • text files (in various places: home, MP3 player, USB thumb drive, CVS, Dropbox)
  • wiki page (private wiki)

and this moving around has had a few interesting side effects. When you find an old todo list it's interesting to see what you did and didn't get done - it's like a window into the past. Knowing where you've been is a good way to avoid repeating it. Some projects hung around on the list for years with little to no real progress. Others that did get done only got done to the point of "scratching the itch" and nothing further. Once the joy or need of something has been met, it takes new motivation to take things to the next level.

It's not uncommon for items on the todo list to have other dependencies, and for a long time having a development environment set up was a big one on this list. That would have required a VMware machine to host it on, which needed hardware (money) and storage (disks, so money too). As this depended heavily on money, it was a major blocking point for many of my personal projects - you could also say it was an easy excuse for not achieving. Over time some of these projects faded into the past, my interest never returning. For others I found a different way of solving the problem the project was going to solve. The ones where I had real motivation were worked on in other, perhaps less optimal, ways.

Sorting out old files and notebooks as part of my clean-up has given me a slightly less blurry view back at how I arrived where I am now, at least going back 10-ish years. Using this as a filter I was able to create a current projects list to work from. This list includes short, medium and long term goals, and most importantly all of them should be achievable.

It's been on my todo list for some time to "cloudify my life", or at least investigate what, how and why the cloud can benefit me, if at all. It seemed to align nicely with the minimalist ideal of throwing everything out and living on the net. However, to avoid getting lost in the cloud, you need to limit yourself to services that actually offer what you need, and not go overboard.

There were a few places to start on that idea. Already being a consumer of some cloud offerings, I needed to look at how I use them, how they work for me, what problems they solve and what new problems they bring. I also needed to figure out what my base needs are and which of those would transition to the cloud. Certain base needs are obvious: email, documents/file storage. Others are less obvious: source code repository, backups, VPS.

So with those two starting points in mind, I've been playing with and reading up on a few different services to decide what to try first. Necessity is the mother of invention, and experimentation is the mother of learning. Usually I'll rank a free service above a paid one unless the paid one is priced right or is that much better. Likewise open source is better than closed, but the better product still wins. To avoid this post getting out of control I'll just link to the services I'm looking at:

  • Dropbox I've used for a few years now; it's great and I've had no issues. The photo albums are easy to make and share with non-Dropbox users, it works on my phone, and you can keep portable apps in it. Free for 2GB, and you can get more space by referring people (you and they both get 500MB more - my referral link). However it may have security issues: it's not end-to-end encrypted, as you can browse your files online.
  • Google Drive Quite new. It integrates Google Docs and comes across as a Google reimplementation of Dropbox. Free for 5GB.
  • Google Apps Lets you run Google's services on your domain - gmail, google docs and so on. I've used for years now, primarily for gmail.
  • GitHub Source code hosting - Git only. You need to pay $7/month for private repos, and then you're limited to 5; public repos are unlimited. All the social aspects you'd expect from a web3.0 site.
  • BitBucket Source code hosting - originally only Mercurial, now has Git too. Free level includes unlimited public and private repos. Owned by Atlassian. All the social aspects you'd expect from a web3.0 site.
  • Backblaze Unlimited backups for $5/month. Great idea, simple UI, aimed at everyone. My only issues are: data is not end-to-end encrypted (just point-to-point), and data has to exist on your equipment to be kept in the backup beyond a reasonable timeframe (30 days, I think), so there's no chance of recovering a file you lost 6 months ago. And since you can do file restores to a USB HDD, the data isn't stored fully encrypted, so if they're ever compromised your data could leak out.
  • Tarsnap Backups for 30c/GB/month (plus transit). Aimed at unix users, built by a security expert, properly encrypted and secure. It runs on Amazon EC2 with data stored in Amazon S3, so it isn't cheap for large quantities of data, but for smaller amounts of sensitive or long-term data it seems fine.
  • MyCyclingLog Cycling tracking. I've used this extensively and it's pretty good at what it does. Unfortunately it's mostly manual entry although I did write my own bulk uploader.
  • Strava Cycling tracking. Better than MCL, integrates with GPS units perfectly. All the social aspects you'd expect from a web3.0 site. Privacy settings seem broken because it leaks your information to everyone by default and doesn't stop doing that when you change it to private.

Ultimately, to replace my VPS I'd need decent source code hosting and proper off-site backups; the only thing left on there then is a (now) static web site, which could easily be moved elsewhere. With a decent solution for cycling tracking I could replace Sports Track, which I currently use on my netbook - though the other option is to not care about it, and live freely knowing I don't need to prove anything to anyone. Gmail has already replaced my desktop email client: excessive spam, not wanting to maintain email infrastructure, and needing always-on access from anywhere drove me to a hosted solution, and Gmail just happens to be awesome and easy to use on your own domain. Dropbox has already largely replaced the need for a USB drive to carry files around - handy, since my phone lacks a standard USB port. FFS Apple, 1995 called and they want their proprietary connectors back (actually worse: my original iRiver MP3 player pre-dates the first iPod and it had a mini-USB connector, and don't forget the dock connector only showed up on gen-3 iPods around 2003).

In the coming posts I'll blab on about these in more detail and what I've found out.

Copyright © 2001-2017 Robert Harrison. Powered by hampsters on a wheel. RSS.