Spam spam spam spam

The internet seems to be a pretty noisy place. I remember an old adage about ‘What is the sound of the internet?’. It seems to have faded as there is no mention of it anywhere, but I’m sure it involved banging on drums and the yowling of cats. A glance through the log file of this web server shows a lot of noise. POST requests to non-existent URLs and garbage variables abound. Judging by the timing of these requests, they come from compromised computers without any human supervision.

It’s the same with email. There is a lot of junk being sprayed around the place by similarly compromised computers. It’s been a problem for a LONG time and no-one has found a magic bullet yet.

I first installed a brand new package called Sendmail on a brand new DG/UX server coming up 30 years ago. I no longer use Sendmail (no-one sane does, I switched to Postfix) and I still have nightmares about writing raw rulesets in /etc/sendmail.cf. I’ve administered a mail server ever since that time and I’ve noticed the changing patterns of the junk we all receive in our emails. The perpetrators of this computer equivalent of a plague of mosquitoes tend to have runs of spam, where they’ll set up or compromise a series of servers to relay their junk. One thing I’ve noticed about the origins of these runs is that they all tend to come from IP addresses administered by the same people. A simple ‘whois A.B.C.D’ will tell you the admins behind any spam origin. How is that useful? If I see a run of spam, all from the same registrar, I’ll simply block ALL IP addresses from that registrar.

How does one do that? Years ago I found a great bit of software in QPSMTPD. It’s not a mail server, just the front end to one. It has a bunch of plugins, all written in Perl (a language uniquely suited to dealing with pattern matching). In the plugin subroutine hook_connect, which is called for each incoming connection, I use something like this…

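# Inside hook_connect: the address of the connecting host.
my $remote_ip = $self->qp->connection->remote_ip;

# Count the whois output lines that match any pattern in the blocklist file.
my $cmd = "whois $remote_ip | grep -E -c -f /etc/qpsmtpd/blocked_registrars";
my $output = `$cmd`;
chomp $output;

$self->log(LOGINFO, "$cmd : $output" );

if( $output == 0 ) {
    $self->log(LOGINFO, "$remote_ip has no dodgy registrar pattern" );
} else {
    return (DENY, "No access to hosts using this registrar" );
}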

If lots of people start doing this, it might force administrators to look after their networks a bit more and perhaps even service the email addresses they put in their ‘whois’ records.

If you think I can help with any of your email server woes, please contact me.

How many development environments do I need?

When I started out writing software, it was simple. You wrote your software and it got deployed elsewhere.

Now, with modern hardware and the complexities of modern software, that simple model no longer holds. Even if you are developing for one environment, say Android, you need to restrict the version and perhaps the hardware that you support. Web development is an order of magnitude more difficult and it’s critical that there is a clear path from what you write to that being installed on a web server (or client).

Let’s work backwards. We have the server on which your software is being deployed. There needs to be another server environment for testing, perhaps a VM on the same machine or another virtualhost. This environment should closely mirror, or even be populated with, data from the live environment. In one system, once the backup of the live data was made, that data was then uploaded to the test server.

Deploying from your development environment to this test server first teaches you what a release really involves, which minimizes disruption when you update the live server.

Credit Cards: Other considerations

3D Secure is a system implemented by both VISA and Mastercard in many places in the world. It’s a system that is meant to add a layer of authentication to the transaction, and also transfer liability for fraudulent chargebacks from the merchant to the card issuer. That last point is a bit problematic. I’ve seen customers able to charge back their transaction, even with an order that has gone through 3D Secure, so liability shift isn’t a given. Make no mistake, it’s worth having, but it’s worth keeping an eye on it too.

Another area worth looking at closely is the status of the transaction. Credit card gateways seem to love adding complexity to their systems, so you’ll find transactions that have been pre-authorized, authorized but not charged, or tagged as potentially fraudulent. If you have address checking enabled, you’ll find all sorts of inconsistencies. Expiry/CVC failures are common too. Mostly these are simply mistakes.

Finally, a card gateway should be able to handle brief communication interruptions. If there are any network issues during the payment status callback to your server, the customer has paid but your system hasn’t been updated. The credit card gateway should continue to try to update your server until it succeeds. Only a few gateways implement this, and its absence leads to customer frustration and increased support staff costs.
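A gateway that retries will sometimes deliver the same status more than once, so the receiving side needs to cope with repeats. Here is a minimal sketch of that idea. The order store, the status names and the choice of HTTP codes are all assumptions made for the example, not any particular gateway’s protocol.

```perl
use strict;
use warnings;

# In-memory stand-in for the orders table; a real shop would use its database.
my %orders = ( 'ORD-1001' => { status => 'pending', amount => '49.95' } );

# Handle one payment-status callback. Returns the HTTP status the gateway
# should see: 200 tells it to stop retrying, anything else invites a retry.
sub handle_callback {
    my ($order_no, $result) = @_;
    my $order = $orders{$order_no} or return 404;    # unknown order

    # A repeated callback for an order that is already settled is
    # acknowledged without being applied a second time.
    return 200 if $order->{status} ne 'pending';

    $order->{status} = $result eq 'approved' ? 'paid' : 'failed';
    return 200;
}

print handle_callback('ORD-1001', 'approved'), " $orders{'ORD-1001'}{status}\n";
print handle_callback('ORD-1001', 'approved'), " $orders{'ORD-1001'}{status}\n";
```

The second call is the retry: it is acknowledged, but the order is left exactly as the first call set it.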

If any of this sounds familiar and you think I can help, please contact me.

Credit Cards: The callback

So the customer has entered all their details and the payment gateway has subtracted funds (or not) from their card balance. What now?

This is where the credit card gateways start to differ. All the good ones have a configurable and secret URL where the card gateway server communicates the status of the transaction back to your server. This is critical. You cannot rely on anything being POSTed back via the customer’s browser, yet time and time again I see this. For maximum security, you’ll need to add the gateway’s IP address to an ACL for this URL on your server. The customer’s ‘Thank you for your order’ page (the one that the gateway redirects the customer to, back on your shop) should do NOTHING but thank the customer for their order.
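The ACL usually lives in the web server configuration, but the same check can be sketched in the callback script itself. This is only an illustration: the network ranges below are documentation placeholders, not any real gateway’s addresses, the function names are made up, and it handles IPv4 only.

```perl
use strict;
use warnings;

# Hypothetical gateway address ranges (documentation ranges as placeholders).
my @gateway_networks = ('192.0.2.0/24', '198.51.100.17/32');

# Convert a dotted-quad IPv4 address to an integer; returns undef otherwise.
sub ip_to_int {
    my ($ip) = @_;
    my @octets = defined $ip ? $ip =~ /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/ : ();
    return if @octets != 4 || grep { $_ > 255 } @octets;
    return ($octets[0] << 24) | ($octets[1] << 16) | ($octets[2] << 8) | $octets[3];
}

# True if $ip falls inside any of the allowed networks.
sub callback_allowed {
    my ($ip) = @_;
    my $addr = ip_to_int($ip);
    return 0 unless defined $addr;
    for my $cidr (@gateway_networks) {
        my ($net, $bits) = split m{/}, $cidr;
        my $mask = $bits == 0 ? 0 : (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF;
        return 1 if ($addr & $mask) == (ip_to_int($net) & $mask);
    }
    return 0;
}

for my $ip ('192.0.2.55', '203.0.113.9') {
    print "$ip: ", (callback_allowed($ip) ? 'accept' : 'reject'), "\n";
}
```

In a CGI-style script the address to test would typically be $ENV{REMOTE_ADDR}, and anything rejected should get an error response without touching the order.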

There are a number of subtleties regarding the status of the order at each stage. You don’t, for example, wish to allocate stock to something that has yet to be paid for, even if it might be soon.

If I can help implement a credit card gateway system in your shop, please contact me.

Credit Cards: The handoff

So your client has added products or services to their cart, entered their shipping details and selected to pay by credit card. We know that, at the very least, we are going to have to send an order number and an amount. Most interfaces require much more than that. In fact, the more information you can send, the better the processing gateway can determine mismatches and detect fraud.

How is the data sent? Usually via a simple POST to an HTTPS page on the payment gateway’s server. Without any authentication, this is open to abuse. A cunning person could easily alter the amount POSTed, which might result in underpayment. Usually the data is encoded (note, not encrypted) with something specific to the payment gateway interface, to avoid abuse. Some payment gateway interfaces require a separate call (think of a curl invocation from your server) to return an order-specific page (i.e. one with GET variables specific to the order) that the customer is then redirected to. This isn’t really any different but does give the processor a bit more information about absent payments.
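One common way of protecting the POSTed fields against tampering is to send a keyed hash of them alongside. The sketch below shows the general idea using an HMAC. The field names, their order, the separator and the shared secret are all assumptions for the example; every gateway defines its own scheme, and many use a different hash or encoding.

```perl
use strict;
use warnings;
use Digest::SHA qw(hmac_sha256_hex);

# Hypothetical shared secret issued by the gateway; never sent to the browser.
my $secret = 'example-shared-secret';

# Build a signature over the fields the customer must not be able to alter.
# The field list and the '|' separator are assumptions for illustration.
sub sign_order {
    my ($order_no, $amount, $currency) = @_;
    my $message = join '|', $order_no, $amount, $currency;
    return hmac_sha256_hex($message, $secret);
}

# What the receiving side does: recompute the signature and compare.
# (A production check would use a constant-time comparison.)
sub verify_order {
    my ($order_no, $amount, $currency, $signature) = @_;
    return sign_order($order_no, $amount, $currency) eq $signature;
}

my $sig = sign_order('ORD-1001', '49.95', 'NZD');
print "signature: $sig\n";
print "genuine amount verifies\n" if verify_order('ORD-1001', '49.95', 'NZD', $sig);
print "altered amount rejected\n" unless verify_order('ORD-1001', '4.95', 'NZD', $sig);
```

The signature travels in the form as one more field. Because the customer never sees the secret, changing the amount in the browser produces a mismatch on the gateway side.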

Once the customer has landed on the credit card payment page, the system needs to wait for a result. This can come in one of two different ways, the subject for tomorrow.

Accepting Credit Cards

So you want to add the ability to accept credit card payments to your shop?

There are a few things you need to consider. You do not want to be storing people’s credit card details in your database. Doing this opens you up to a great many negative things. Security breaches mean that these details can be stolen. It also means your server must comply with a stricter level of PCI requirements.

So, unless you are a major player or are a credit card processor yourself, you want to offload this responsibility to someone else. I think that perhaps I’ve implemented more of these card gateway shop interfaces than anyone else, so over the next $however_long_it_takes I’m going to detail what a good one looks like. I might even give examples.

Next up, the handoff.

Harmonized Shipping Labels

Ever noticed those huge shipping labels on your parcels? Those are harmonized shipping labels. They are pretty standard and contain a lot of information about the contents, destination and sender. Recently, a customer’s standard delivery agent (the post office) required that every parcel sent out carry a ‘harmonized shipping label’. Now there are quite a few different ways to do this. You can get your system to print out the labels, assigning a unique identifier to each package, then upload all this information into the mail system, or you can generate a standard file (usually a CSV) from your system, import it into the mail system and get THEIR system to generate the shipping labels.

It doesn’t take a genius to figure out which is easier from the system point of view. Getting the post office to lay out the labels is much easier, until your sales volume grows to the point where it is worth going the whole hog, printing the labels yourself and uploading the resultant tracking data to the post office.
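The CSV route can be sketched in a few lines. The column names here are invented for illustration; the real ones come from whatever import specification the post office publishes. Every field is quoted so that commas and quotes inside addresses survive the trip.

```perl
use strict;
use warnings;

# Quote one CSV field: wrap it in double quotes and double any embedded quotes.
sub csv_field {
    my ($value) = @_;
    $value = '' unless defined $value;
    $value =~ s/"/""/g;
    return qq{"$value"};
}

# Turn a list of order hashes into CSV text, header row first.
# $columns is an array ref of (hypothetical) column names.
sub orders_to_csv {
    my ($columns, @orders) = @_;
    my @lines = ( join(',', map { csv_field($_) } @$columns) );
    for my $order (@orders) {
        push @lines, join(',', map { csv_field($order->{$_}) } @$columns);
    }
    return join("\r\n", @lines) . "\r\n";
}

my @columns = qw(order_no name address1 city postcode country weight_g);
print orders_to_csv(\@columns,
    { order_no => 'ORD-1001', name => 'A. Customer',
      address1 => '1 "The Lane", Flat 2', city => 'Wellington',
      postcode => '6011', country => 'NZ', weight_g => 450 },
);
```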

There are quite a few curlies along the way, including package aggregation, size considerations and much more.

If you think that I could help, please contact me.

Customer Support?

How much is your customer support time worth? These people are responsible for your repeat business and it makes sense to make their job as clear and as simple as possible. Give them clear guidelines and the tools to do their job properly.

If they are still using email to interact with your customers, you need to reconsider this. Email isn’t the most reliable of communication methods: delivery isn’t guaranteed, and you expose your support staff to security problems.

A recent survey of ‘trouble ticket’ systems left me underwhelmed. I did trial one that required Tomcat to be installed on the server. The security implications left me uneasy (fears which were subsequently confirmed) and the system simply didn’t provide enough functionality. Your support system should integrate with your orders system, showing the customer’s full order history and previous interactions with your staff. It should give clear indications of levels of replacement stock, should that be needed, without the support staff having to exit and open half a dozen different screens.

In the end, I wrote one from scratch. It took me about 2 weeks. It was refined once in production, but it functioned from the start.

If this is something that you think I could help with, get in touch.

Too much data

Is your database slowing you down? Simply too much data?

Here is a tip. Create a top level integer attribute on the main offending table and call it or_year or similar. Create a non-unique index on this attribute. or_year can default to NULL. A daily update process can fill in this year attribute for data older than, say, 3 months, and the query you are having issues with can have an ‘AND or_year IS NULL’ added to it, so that it only has to consider recent rows.
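Here is a rough sketch of that tip using DBI against an in-memory SQLite database. It assumes the DBI and DBD::SQLite modules are installed, and the table and the other column names are made up for the example. The date functions are SQLite’s, so the SQL will need adjusting for other databases, and it is worth checking that your database indexes NULL values.

```perl
use strict;
use warnings;
use DBI;

# In-memory SQLite database, purely for demonstration.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '', { RaiseError => 1 });

# Hypothetical orders table with the extra or_year attribute, NULL by default.
$dbh->do(q{
    CREATE TABLE orders (
        or_id     INTEGER PRIMARY KEY,
        or_date   TEXT NOT NULL,
        or_status TEXT NOT NULL,
        or_year   INTEGER DEFAULT NULL
    )
});
$dbh->do(q{CREATE INDEX orders_or_year_idx ON orders (or_year)});

# One old order and one recent order.
$dbh->do(q{INSERT INTO orders (or_date, or_status) VALUES (date('now', '-2 years'), 'shipped')});
$dbh->do(q{INSERT INTO orders (or_date, or_status) VALUES (date('now', '-2 days'), 'open')});

# The daily job: stamp the year on anything older than three months.
sub archive_old_orders {
    my ($dbh) = @_;
    return $dbh->do(q{
        UPDATE orders
           SET or_year = CAST(strftime('%Y', or_date) AS INTEGER)
         WHERE or_year IS NULL
           AND or_date < date('now', '-3 months')
    });
}

archive_old_orders($dbh);

# The slow query, now restricted to rows that have not been archived.
my $recent = $dbh->selectall_arrayref(
    q{SELECT or_id, or_status FROM orders WHERE or_year IS NULL ORDER BY or_id}
);
print "recent orders: ", scalar(@$recent), "\n";
```

Reports that do need the old data can still reach it by asking for a specific or_year, which uses the same index.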

If this is the sort of thing you might need help with, feel free to contact me.

The Pareto Principle

From Wikipedia

The Pareto principle (also known as the 80/20 rule, the law of the vital few, or the principle of factor sparsity) states that, for many events, roughly 80% of the effects come from 20% of the causes.

This is true in business too. Around 80% of your profit usually comes from 20% of your customers. At the other end of the scale, there are customers you really don’t want repeat business from.

One client, who runs a large worldwide shop with a particularly difficult customer demographic, needed some decent software to make sure some customers simply didn’t come back. James Spader would be proud of the result.

This blacklist shop module grew into quite a complex beast: fingerprinting the client’s OS, looking at network blocks and using all available information, including some very fuzzy matching of postal addresses, to come up with a single number that determines whether this client should have their credit card charged.

Of course once charged, we have even more information. We have an expiry date and the last 4 digits of their CC. Unfortunately, this isn’t always unique, even on a country by country basis, so we need to involve a human here.

If this sounds like the thing you might like for your shop, contact me.