How I became a hypocrite without noticing

Last week I realised I’m a hypocrite. I’ve skipped the whole facebook fad, the twitter fad and so on, primarily because I didn’t trust the owners to do the right thing with whatever data of mine ends up on their service. I’d convinced myself this was the case and, from my initial scan of the facebook policies, wrote off the whole service. Of course since then there have been many privacy violations/leaks to add fuel to the fire, and the privacy policy grew longer than the US constitution (currently it’s split into multiple pages so it doesn’t look so bad). Twitter just fell into the chasm of general social networking stuff. I say fad because I’d ignored previous social networking sites just fine: every time I decided to go sign up, a new one showed up and everyone jumped to it at once, so I held off to see if it’d take off, and kept waiting again. Well, perhaps with facebook the third time’s the charm; it hasn’t passed yet, and I’m right at that point again. Expect the next big thing any day now.

Well last week I realised I’ve got this all wrong. I’ve already sold my soul to the cloud and have ignored it like a blind spot. Gmail.

I started using gmail when spam became intolerable and spam assassin stopped working well enough for me. Maintaining my own email server became too great a burden for not enough gain. Gmail’s spam filter is excellent and just works. I like things that just work. A decent web mail client wasn’t around at that stage, or not for free. The ajax’y ones weren’t that good yet and gmail’s was, again, excellent. So I did it, I switched. I even put up with having to forward all my mail to a single email account and handling it all through there. I also put up with the from header being set to it, exposing the actual account name, which I never liked. Not a perfect solution but it worked. 80/20 rule be gone, this was more 95/5. Initially it was a tactical solution which just never got changed.

Then google apps came along. Even better: my whole email solution could be google hosted. However I still used forwards to the master gmail account (because the google apps account wasn’t a full featured google account at the time) and I used other services. Since then google apps accounts have become 99% full featured (95/5 again), so I moved all my data from my gmail google account to my apps google account, and all was happy again. My from headers are right now, still no spam, it just works.

However by this stage I had already been sucked into the cloud. As much as I don’t trust google (or trust them only slightly more than facebook/twitter) I do use it for important stuff - my email. The contents of my account are important, and because of this I was backing it up for a long time. Though not for a while now. When I moved between accounts I realised the backup tools all suck - hint: they don’t back up everything and they battle to restore everything, if restore is even an option.

From the privacy perspective: email can be faked easily enough, or lost in spam, so the level of trust/denial is quite low anyway, just like sms. Being one in billions on the service helps you to blend in.

I believe eventually everything posted online will become public, through poor information security, coding mistakes, disgruntled/curious employees, cracking and so on. If you’re hiding information behind a site’s privacy settings to keep something private, it’s just a time delay lock, not a safe. It’s happened before and it’ll happen again: information leaks. Don’t trust anyone.

So the answer is either to play with fake information, even if that limits the usefulness of playing in the first place, or not to play the game at all.

Right now I’m toying with two courses of corrective action:

  1. Wipe the slate clean and start anew, fully embracing the cloud and what it provides. AKA the whole hog.

  2. Pull back from the error of my ways and figure out a new way forward/around. AKA the long way.

I’ve ignored the middle ground (combining what I already use with the rest of the cloud, but with fake details rather than fully embracing it) because that dilutes the potential and raises more questions than it answers.

Or simply continue on as I have been and ignore my internal monologue’s screaming. That’s the do nothing option.

Mental note: must not forget to include a random image.

[Image: Philosiraptor meme - Private Investigators]

<3 the philosiraptor. As much as I like things that just work, it’s getting harder. And no, I haven’t bought a mac, yet.


Learning Python

Python is an interesting one. My initial plays with python (as a language) were fine. It seems ok to me: it’s at least as productive as php, and once I’m more fluent probably more so. I wasn’t sure what to think when I saw the section on IEEE floats in the tutorial. Having done compiler and OS design, I understand what’s going on, and thought it was interesting that the float type in Python actually is an IEEE float, not a nicely wrapped up other type.

It all stems from infinitely repeating fractions. In decimal, 1/3 can be approximated as 0.333, with more 3’s to add more precision, but it’s still an approximation. In binary, 1/10 (base 10) is an infinitely repeating fraction and can be approximated as 0.00011001100110011 (base 2). The tutorial specifically showed some examples, namely that the comparison (0.1 + 0.2 == 0.3) returns False, because 0.1 + 0.2 comes out as 0.30000000000000004 (approximately).
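A quick way to see this for yourself is the sketch below. It only uses the standard library; reaching for math.isclose and decimal.Decimal as workarounds is my own choice here, not something lifted from the tutorial.

```python
import math
from decimal import Decimal

total = 0.1 + 0.2

# The binary approximations of 0.1 and 0.2 do not add up to the
# binary approximation of 0.3, so an exact comparison fails.
exact_match = (total == 0.3)

# Comparing with a tolerance is the usual workaround for floats.
close_enough = math.isclose(total, 0.3)

# Decimal does its arithmetic in base 10, so values built from
# strings add up exactly.
decimal_match = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))

print(repr(total))     # 0.30000000000000004
print(exact_match)     # False
print(close_enough)    # True
print(decimal_match)   # True
```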

I’ve never come across this in any other language (not in C# at least, deliberately or accidentally). For a high level language it seems like a low level problem that at least half the users wouldn’t understand if they ever came across it. On the other hand, the fact that Python automatically caches its compiled bytecode, and that other bits seem lower level than in other scripting languages, suggests high performance is at least possible; just beware of any built in limitations. Avoid the potholes.

Ignoring that, I do feel the class/OO stuff seems tacked on. Methods have a required first parameter, usually named self, which refers to the object instance the method was called on. When calling the method you don’t supply it.

class Zomg:
    def __init__(self):
        Zomg.sup = "static variable"
        self.lol = "instance variable"
    def set_class_variable(self, newvalue):
        Zomg.sup = newvalue
    def set_instance_variable(self, newvalue):
        self.lol = newvalue
    def print_them(self):
        print("instance lol: {0}\nclass sup: {1}".format(self.lol, Zomg.sup))

Then to call it

z = Zomg()
z.set_instance_variable('blah')
z.print_them()

Static variables are accessed via the class name rather than self, so Zomg.sup would be static across all instances of class Zomg.

To access a method on the object you need to call it on the object self, so self.set_instance_variable('blah') rather than just set_instance_variable('blah'), even from within the object.
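To make the class-versus-instance distinction concrete, here is a small sketch. The Counter class and its attribute names are made up for illustration; they aren’t part of the Zomg example.

```python
class Counter:
    shared = 0  # class attribute: one copy for every instance

    def __init__(self):
        self.own = 0  # instance attribute: one copy per object

    def bump(self):
        Counter.shared += 1     # qualified with the class name: updates the shared copy
        self.own += 1           # qualified with self: updates this object only
        return self.describe()  # calling another method also needs the self. prefix

    def describe(self):
        return "own={0} shared={1}".format(self.own, Counter.shared)


a = Counter()
b = Counter()
a.bump()
a.bump()
b.bump()
print(a.describe())  # own=2 shared=3
print(b.describe())  # own=1 shared=3

# Assigning through self (or the instance) creates a new instance
# attribute that hides the class one for that object only.
a.shared = 99
print(Counter.shared, a.shared, b.shared)  # 3 99 3
```

The last three lines are the pothole to watch for: reading a class attribute through an instance works, but assigning through it quietly makes a per-object copy.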

At first the scoping rules caused me some grief, I’m not sure why now. It’s very similar to php.

class Zomg
{
    public static $sup; // static properties must be declared before use
    public $lol;

    function __construct() // older php versions also allowed a constructor named Zomg
    {
        Zomg::$sup = "static variable";
        $this->lol = "instance variable";
    }
    function set_class_variable($newvalue)
    {
        Zomg::$sup = $newvalue;
    }
    function set_instance_variable($newvalue)
    {
        $this->lol = $newvalue;
    }
    function print_them()
    {
        echo "instance lol: ".$this->lol."\nclass sup: ".Zomg::$sup;
    }
}

Then to call

$z = new Zomg();
$z->set_instance_variable("blah");
$z->print_them();

It’s actually very similar, only the method signatures don’t need the $this pointer, it’s automagic!

Whereas C# which has mature OO (right, at least it was built in from the start) would be like this:

class Zomg
{
    public static string sup;
    public string lol;
    public Zomg()
    {
        sup = "static variable";
        lol = "instance variable";
    }
    public void set_class_variable(string newvalue)
    {
        sup = newvalue;
    }
    public void set_instance_variable(string newvalue)
    {
        lol = newvalue;
    }
    public void print_them()
    {
        Console.WriteLine(string.Format("instance lol: {0}\nclass sup: {1}",lol,sup));
    }
}

And to call

Zomg z = new Zomg();
z.set_instance_variable("blah");
z.print_them();

Obviously it’s a bit longer because it’s not dynamically typed and needs declarations. But I feel the syntax is cleaner by not needing to qualify everything to either the class or instance scope. On the other hand, forcing you to qualify it makes you realise all the time which it is, and you won’t accidentally create a local variable which masks the class/instance one.

Python lets you alias everything too.

zee = Zomg
z1 = zee() # also creates a new instance of class Zomg

Just to confuse us, you can alias all variables/methods/classes (probably great for obfuscation contests)
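A short sketch of what that aliasing looks like across classes, methods and built-ins. Greeter and the variable names are hypothetical, picked just for this example.

```python
class Greeter:
    def hello(self):
        return "hello"


Alias = Greeter        # a second name for the same class object
g = Alias()            # creates a Greeter instance
say = g.hello          # a bound method stored in a variable
shout = str.upper      # a built-in method aliased via its class

print(Alias is Greeter)   # True
print(type(g).__name__)   # Greeter
print(shout(say()))       # HELLO
```

It works because names in Python are just references to objects, and classes and functions are objects like any other.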

Python has built in support for sqlite which is great, but out of the box didn’t have mysql support. I found what looked like a good mysql connection driver, and it’s about to have 2.0 released which is a total rewrite of the 1.2 branch. Ummm what? Fortunately it seems that’s the one everyone’s using and a package exists in Fedora for it; MySQL-python (not the most obvious name either considering other similar/related package names; mysql, mysql-server, php-mysql, python-cheetah, python-yenc). Once that was installed it was available.
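For comparison, the built in sqlite support needs nothing installed at all. A minimal sketch, assuming Python 3 and using an in-memory database with a made-up posts table:

```python
import sqlite3

# ":memory:" gives a throwaway database that lives only as long as the connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")

# Placeholders (?) leave the quoting of values to the driver.
conn.executemany(
    "INSERT INTO posts (title) VALUES (?)",
    [("Learning Python",), ("How I became a hypocrite",)],
)
conn.commit()

rows = conn.execute("SELECT id, title FROM posts ORDER BY id").fetchall()
print(rows)  # [(1, 'Learning Python'), (2, 'How I became a hypocrite')]
conn.close()
```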

As for web development in Python, I thought the distraction was going great, and it was until I made the switch from mod_python to mod_wsgi. I’m sure I’m missing something, but a lot of googling and reading suggests otherwise. Basically Python3’s “all strings are unicode” thing breaks the wsgi specification, or at least the behaviour isn’t agreed on, so it can be unpredictable (see below). I also struggled initially with python3 vs python2.7 on my machine: mod_wsgi is linked against one version only, which I didn’t realise until later, and I was writing python3 code which is not 100% compatible with 2.x.

Unpredictable, specifically: reading in the request body relies, for some reason, on calling read(x), with x being the content-length taken from the request header. Where I come from we don’t trust anything from the client, so relying on that is bad joo joo. It was further complicated by the file handle for the input stream being an ascii type stream, not unicode, so I had to convert it to unicode or I couldn’t use it in normal string functions. Oddly enough, parse_qs (which is built in to parse querystring format strings) couldn’t take ascii and did want unicode. WSGI is like cgi in many ways (which is fine, just wrap up the complexity and it’s ok), but dodgy things like that annoy me. I shouldn’t have to do something like (pseudocode):

    namevaluedict = parse_qs(req.input.read(req.headers["content-length"]).to_unicode())

In the end I realised that if the client screws up the content-length header it doesn’t matter. Too small and the body just gets truncated (fine); too long and read(x) will block until a timeout. It doesn’t matter because php does the same thing, so I stopped caring, laaa laaa laaa. Ignorance is bliss.
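For the record, a runnable version of that dance might look roughly like the sketch below. It assumes Python 3 and a server that hands over wsgi.input as a byte stream; read_form, the max_bytes cap and the fake request are all my own inventions for illustration, not part of any WSGI library.

```python
import io
from urllib.parse import parse_qs


def read_form(environ, max_bytes=65536):
    """Return POSTed form fields as a dict of lists (hypothetical helper)."""
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        length = 0  # a garbage header is treated as "no body"
    # Never read more than our own cap, whatever the client claims.
    length = max(0, min(length, max_bytes))
    raw = environ["wsgi.input"].read(length)  # bytes, not str
    return parse_qs(raw.decode("utf-8"))      # decode so we get str keys/values


def app(environ, start_response):
    """Tiny WSGI app that echoes the parsed form back as text."""
    body = repr(read_form(environ)).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8"),
                              ("Content-Length", str(len(body)))])
    return [body]


# Simulate a request without a web server.
payload = b"name=zomg&tag=a&tag=b"
fake_environ = {
    "REQUEST_METHOD": "POST",
    "CONTENT_LENGTH": str(len(payload)),
    "wsgi.input": io.BytesIO(payload),
}
print(read_form(fake_environ))  # {'name': ['zomg'], 'tag': ['a', 'b']}
```

The cap only bounds how much gets read. With a real socket, a content-length bigger than what the client actually sends can still leave read() waiting, which a BytesIO stand-in like this one won’t reproduce.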

So the major hassle I came across was with web apps: you end up with a cross between a high level language and an (essentially) low level cgi interface.

Though all I really want in a web language is (distraction/sidebar):

  1. easy handling of get/post form variables
  2. form file uploading handled some way (pluggable ideally)
  3. cookie handling or custom set/get of headers
  4. performance options - around threads, connection pooling, caching and so on. Persistent application server process perhaps.
  5. generic ish database access layer is nice too
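Items 1 and 3 on that list can be had from the standard library alone. A rough sketch, assuming Python 3 and a WSGI-style environ dict; the three helper functions and the fake request are hypothetical names of my own:

```python
from http.cookies import SimpleCookie
from urllib.parse import parse_qs


def get_vars(environ):
    """GET variables from the query string, as a dict of lists."""
    return parse_qs(environ.get("QUERY_STRING", ""))


def get_cookies(environ):
    """Incoming cookies as a plain name -> value dict."""
    jar = SimpleCookie()
    jar.load(environ.get("HTTP_COOKIE", ""))
    return {name: morsel.value for name, morsel in jar.items()}


def set_cookie_header(name, value, path="/"):
    """Build a (header, value) tuple suitable for start_response."""
    jar = SimpleCookie()
    jar[name] = value
    jar[name]["path"] = path
    return ("Set-Cookie", jar[name].OutputString())


# Simulated request data, no web server needed.
fake_environ = {"QUERY_STRING": "page=2&tag=a&tag=b",
                "HTTP_COOKIE": "session=abc123; theme=dark"}
print(get_vars(fake_environ))
print(get_cookies(fake_environ))
print(set_cookie_header("seen", "1"))
```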

I draw the line at templating engines and so on. Ideally the developer knows how best to render the content; if it’s a small output, a template engine could be too much overhead and you just want to serve pages fast. All the other framework provided services (MVC most often) I feel usually have way too much overhead. It’s great if you want it all OR if you’re prototyping (faster development, more productive, code reuse etc), but if you don’t need many of the features it’s usually hard to cut them out. My reason for this is I like simple cut down apps. No frills, just get the job done, simply and fast. Less code means less code to maintain. I’m not a fan of the black box, especially when what comes out is not what it should be and you can’t step into it to debug. This is why on php I’ve been using POG for simple database objects, but even there I found some odd behaviour. Under php 5.1 I had to manually escape colons in strings going to the database; under 5.2+ I didn’t.

I’m sure I’ve overthought this, and the simple app I was working on has taken far too long so far. However, I have learned a lot of the ins and outs of python and the many ways to deploy it.

