Multiplying Spam

Rick asks: "For some number of months, I was receiving virtually no spam at all. About two months ago, the spam began increasing on a daily basis. Moreover, most of the spam is landing in the inbox instead of the spam box. Have new spammers come online recently? Also, I use AT&T/Yahoo e-mail. I noticed that the spam filter has a limit of 500 e-mail addresses. Do you know of a spam blocker that I could add to this e-mail service? Should I be looking at switching to another e-mail service provider, one that is not necessarily free?" The quick and easy answers are yes and yes. But you'll want more than that.

So here's the longer answer.

What you're experiencing is the general ebb and flow of spam. New spammers enter the "trade" every day. And although the average rat has more intelligence than the average spammer, some of the people who write the spamming programs are pretty sharp. The average spammer buys these programs hoping to make a fortune in the spam biz.

What happens is that new techniques get past the Bayesian filters for a while, but eventually the filters catch up. During the time that the filters are learning, spam volume increases. And the spam masters regularly add to their stable of compromised machines. As a result, new spam sources aren't yet listed in the various blacklists that some ISPs use to block the sludge.

Wikipedia has a good explanation of the process.

Particular words have particular probabilities of occurring in spam e-mail and in legitimate e-mail. For instance, most e-mail users will frequently encounter the word "Viagra" in spam e-mail, but will seldom see it in other e-mail. The filter doesn't know these probabilities in advance, and must first be trained so it can build them up. To train the filter, the user must manually indicate whether a new e-mail is spam or not. For all words in each training e-mail, the filter will adjust the probabilities that each word will appear in spam or legitimate e-mail in its database. For instance, Bayesian spam filters will typically have learned a very high spam probability for the words "Viagra" and "refinance", but a very low spam probability for words seen only in legitimate e-mail, such as the names of friends and family members.

After training, the word probabilities (also known as likelihood functions) are used to compute the probability that an e-mail with a particular set of words in it belongs to either category. Each word in the e-mail contributes to the e-mail's spam probability, or only the most interesting words. This contribution is called the posterior probability and is computed using Bayes' theorem. Then, the e-mail's spam probability is computed over all words in the e-mail, and if the total exceeds a certain threshold (say 95%), the filter will mark the e-mail as a spam.
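The training and scoring steps described above can be sketched in a few lines of Python. This is only a minimal illustration of the naive Bayes idea, not the code that any particular mail provider or product runs. The function names, the add-one smoothing, and the whitespace word splitting are all assumptions made for the sketch; the 95% threshold comes from the explanation above.

```python
import math
from collections import Counter


def train(labeled_messages):
    """Count how many spam and legitimate messages contain each word.

    `labeled_messages` is a list of (text, is_spam) pairs, standing in for
    the user's manual "this is spam / this is not spam" decisions.
    """
    spam_counts, ham_counts = Counter(), Counter()
    n_spam = n_ham = 0
    for text, is_spam in labeled_messages:
        words = set(text.lower().split())  # each word counts once per message
        if is_spam:
            spam_counts.update(words)
            n_spam += 1
        else:
            ham_counts.update(words)
            n_ham += 1
    return spam_counts, ham_counts, n_spam, n_ham


def spam_probability(text, model):
    """Combine per-word evidence using Bayes' theorem (words treated as independent)."""
    spam_counts, ham_counts, n_spam, n_ham = model
    # Start from the smoothed prior odds; log space avoids numeric underflow.
    log_odds = math.log((n_spam + 1) / (n_ham + 1))
    for word in set(text.lower().split()):
        # Add-one smoothing (an assumption of this sketch) keeps a word that
        # was never seen in one category from producing a zero probability.
        p_word_given_spam = (spam_counts[word] + 1) / (n_spam + 2)
        p_word_given_ham = (ham_counts[word] + 1) / (n_ham + 2)
        log_odds += math.log(p_word_given_spam / p_word_given_ham)
    return 1 / (1 + math.exp(-log_odds))


def is_spam(text, model, threshold=0.95):
    """Mark the message as spam only when the probability exceeds the threshold."""
    return spam_probability(text, model) > threshold
```

With only a handful of training messages, few e-mails will clear a 95% threshold, which is one way to see why a freshly installed filter needs a training period before it becomes accurate.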

Currently I use Spam Arrest, which costs about $50 per year, in front of my mailbox. Spam Arrest has a whitelist that I provided, a list of addresses that I always want to receive messages from. I have also provided a smaller list of addresses that I never want to receive mail from. Whitelisted messages are forwarded to me without delay; blacklisted messages are discarded without question.

Other messages, those from people I don't know, go into a quarantine area and the sender receives a challenge message. If the challenge message is undeliverable, the received message is placed in a special area that I won't see unless I ask to see it and it will be deleted after 30 days. The other messages are sent to an "Unverified" area that I look at several times a day. A quick glance at the subject lines is enough to tell me whether the message is spam or not, so the usual process involves a quick glance, checking "Select All", and then clicking "Delete".

When I receive a message from a new legitimate correspondent, I can approve the message. This sends it to my inbox and adds the sender to my whitelist.
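The routing rules described above are simple enough to sketch in code. The following Python is a hypothetical model of a challenge-response mailbox, not Spam Arrest's actual implementation; the class name, method names, and folder names are invented for illustration, and the 30-day purge of undeliverable-challenge mail is noted in a comment but not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Mailbox:
    whitelist: set = field(default_factory=set)   # always deliver
    blacklist: set = field(default_factory=set)   # always discard
    inbox: list = field(default_factory=list)
    unverified: list = field(default_factory=list)     # reviewed by hand
    undeliverable: list = field(default_factory=list)  # hidden; purged after 30 days

    def receive(self, sender, subject, challenge_delivered=True):
        """Route one message and return the name of the place it landed."""
        if sender in self.whitelist:
            self.inbox.append((sender, subject))
            return "inbox"
        if sender in self.blacklist:
            return "discarded"
        # Unknown sender: a challenge goes out. Where the message waits
        # depends on whether that challenge could be delivered.
        if not challenge_delivered:
            self.undeliverable.append((sender, subject))
            return "undeliverable"
        self.unverified.append((sender, subject))
        return "unverified"

    def approve(self, sender):
        """Whitelist a sender and move that sender's waiting mail to the inbox."""
        self.whitelist.add(sender)
        self.inbox.extend(m for m in self.unverified if m[0] == sender)
        self.unverified = [m for m in self.unverified if m[0] != sender]
```

The order of the checks matters: the whitelist is consulted first so that a known correspondent never sees a challenge message.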

As good as this sounds, there is a drawback. Some people are terribly offended by challenge messages; I can understand why. This is probably the last year that I will use Spam Arrest because the company that hosts my domain, Blue Host, offers PostIni for $12 per year per e-mail account. I've tested this on my wife's account and it has proved to be surprisingly accurate following a relatively short (2-3 week) training period.

Increasingly, larger Internet service providers, and some smaller providers, are adding PostIni (a Google service) either as an included service or an extra-cost add-on. In either case, this is what I would recommend if you're looking for a better way to control spam.

After 20 minutes of digging through the various SBC, Yahoo, and AT&T Web pages, I've concluded that they don't want you to know much about the service. They may offer PostIni, but there's certainly nothing that clearly states so one way or the other.

It might be worth a call to SBC. With luck, you'll be connected within 30 minutes to a technician who knows something (I estimate the probability of this at about 10%). I have dealt with SBC support several times and only once have I encountered someone who understood the question I was asking (and the questions have never been particularly difficult ones).

So good luck with that.

Surrounded by Antivirus Programs

This is an unfinished story, a work in progress. I'm telling it now because it illustrates the frustrations of trying to find the right application or applications to protect your computer's data. For many years, I used AVG Antivirus because it was a light user of system resources. But version 9, released late in 2009, had begun to remind me of Norton Antivirus, the application AVG had replaced. I removed AVG and installed, over the next few months, several antivirus products. Some were free, others were free trials. Shopping for an antivirus program these days isn't easy.

It's important that you understand the limitations of this report: Specifically, I assume that most antivirus products from well-known providers (Symantec, AVG, Kaspersky, Avast, Avira, McAfee, and the like) will generally catch most threats. Any particular application can miss a new threat and any particular application can raise a false alarm or send out a bad update. Over the years, this has generally proved to be accurate.

Some providers offer free versions in addition to their paid versions. In general, the free versions have shown themselves to be competent and reliable, missing only features such as an improved interface that comes with the paid version or some of the high-end protective features. Nearly all providers offer at least a 30-day free trial and it's a good idea to take them up on the offer.

But always remove one antivirus program before installing another. Two antivirus programs are definitely not better than one. Each will get in the other's way and each will think that the other is a virus.

Here's what I looked at and what I decided. Your conclusions may differ because your requirements are different from mine. For that reason, I'm omitting cat ratings for most of the applications.

Avast (Nope)

This application looked promising, but it came with several annoyances. Whenever it interacted with Microsoft Outlook, it popped up a notice to tell me what it was doing. I found out how to turn that off, but it should have been off by default. It also announces, in a loud voice, that it has updated the virus database. Beyond that, the interface was designed to look like a CD player. Just too cute. Ugh.

Avira (FAIL!)

Avira is a German application. I tried the free version. The download and installation went well, but then I couldn't get it to update the database. It would start and just sit there. In reviewing the Avira website support section, I saw that the problem wasn't unusual. The support staff suggested trying a manual update, and clear instructions were provided. For some people, that eliminated the problem, but not for others. It worked for me.

G Data Total Care Looked Good
(WAIT! Don't Do This!)

This is another German application and it includes an anti-spam filter that I like. My initial attempts to download program updates failed, though. The application told me that I needed administrator rights to perform the action. I have administrator rights, but the program must be run as "the" Administrator. This is something that will escape many users' understanding of how the system works. When I selected the program from the Start Menu, right-clicked it, and selected "Run as administrator", I was able to obtain the update. This is not acceptable.

Once I did that, though, future updates arrived without a problem.

I'm not a big fan of scanning the entire computer, but the application seemed to want to do that. I approved and the process ran for nearly 18 hours. I have to admit that the computer I use for testing has a lot of disk space (currently in excess of 2.5 terabytes) and those disks contain a lot of files (according to G Data, 1.8 million files).

Based on the length of this section, you've undoubtedly already concluded that this is the application I've selected (but later rejected). I'll show you what I saw along the way, explain why I made the decisions I made, and tell you why I swept this application off my computer.

Let's start at the beginning, with the installation. G Data asks more questions and offers more options than many other programs. Generally I don't care for multi-purpose protective programs, but I decided to evaluate the full application and to enable most of the functions to see how much this would degrade the system's performance. Here I've agreed to allow the program to report malware to G Data.

The next screen was easy. I hadn't bought the program, so I didn't have a serial number. To continue, I installed the trial version.

For scheduling, I requested an hourly update for virus definitions, but declined weekly scans of the computer and I didn't select the backup option because I already have a reasonably well structured backup system.

At the end of the installation, I started a scan. The initial system check and rootkit scan took only a couple of minutes, but I knew the full scan of the drives would take longer. I wasn't prepared for how much longer it would take. For now, I cancelled the scan so that I could finish the installation by updating the program.

Updates require registering the computer with the G Data server. Ostensibly this is so that the server will know what versions of the components your computer is running. In fact, this is probably a method to keep a user from buying a single license and using the application on more than one computer.

The definitions update concluded normally, but I noticed that the application itself had not been updated.

When I tried to obtain the update, I was told that I needed to have Administrator rights. I do have Administrator rights. Eventually, I tried running the G Data control panel from the Start Menu as Administrator.

That worked.

Trying to put off the system scan for a while, I took a look at the e-mail log and was impressed by what I found. G Data had correctly identified and marked most spam. There was a single false positive and correcting the entry seemed intuitive to me.

Here's the entry in my e-mail program. By default G Data will only mark the messages it suspects before delivering them.

I couldn't put off the inevitable any longer. Now it was time for the system scan. I expected that it would take 3 to 5 hours.

The system and rootkit scans completed quickly, as before. Then G Data moved on to scan drives C, D, G, H, Y, and Z.

As the scan continued, I looked at my e-mail accounts and was again impressed by the "out-of-the-box" accuracy.

But G Data seemed to be using a lot of system resources. Here it's using slightly more than 20% of the CPU. Later, I found that the default setting is for the scan to get relatively more resources so that it will run faster. The user can change that.

The scan also consumed nearly a quarter of the system's memory.

During the process, disk access was high. One drive was saturated most of the time and sometimes two drives were saturated, meaning that the device was running at 100% capacity. When this happens, the computer can do nothing else.

This disk saturation is not entirely the fault of G Data, though. Either a hardware conflict or a Windows misconfiguration causes process ID 4, which belongs to Windows, to consume enormous disk resources. I have seen this problem on this computer with Windows XP, Windows Vista, and now with Windows 7. It has proved to be a problem with an elusive solution.

After the scan had run for an hour, G Data predicted that the process would continue for another 23 hours!

As you can see, the G Data scan is using a lot of disk resources, but the real killer is the "system" process (PID 4) that's reading nearly 4GB of data per second and writing more than 1.5GB of data per second. When combined with G Data's 2GB per second of combined reading and writing, this activity is enough to stop the computer in its tracks.

I continued to monitor the process. PID 4, which is probably servicing some requests from G Data, continued to be the primary user of system resources.

Here's an interesting situation in which both drives C and D are fully saturated.

Previously, I had suspected Carbonite of consuming a lot of disk resources, but here Carbonite is deactivated and I paused the G Data system scan. PID 4 is reading and writing more than 35MB of data per second.

The scan didn't really take 24 hours. It finished overnight, while I was sleeping. In the morning, I found that it had identified several files. Some files had been incorrectly accused, though.

  1. These files all contain infected files. I had received some bad e-mail attachments and had retained them for future examination. It concerned me a bit that G Data had identified the entire e-mail data file that contained thousands of messages, but I had full access to the rest of the file. Only the bad files were off limits.
  2. This file is a configuration file that belongs to a program in the directory shown. I was not allowed to open or move the file and there seemed to be no simple way to remove it from G Data's clutches. Later, I figured this out.
  3. I understand why this file was identified, but I don't want it to be. It's a utility by Steve Gibson to test certain firewall shortcomings.

G Data needs to provide an additional option here. I can repair the file, move it to quarantine, or delete it. In this case, I wanted to tell G Data that the file is not a threat. There is a way to do this, but it's not immediately obvious.

With the scan complete, G Data now gave the computer a green light.

Then the Troubles Began: Why Is E-mail so Slow?

Retrieving e-mail is something I do frequently. Although G Data does a good job of marking and classifying spam, I decided that the feature wasn't worth the cost. Prior to installing the application, I could retrieve e-mail in 30 to 60 seconds. After installing G Data, downloading e-mail consumed 3 to 10 minutes.

Unacceptable. Totally unacceptable.

It was time to re-assess what I needed.

A New Way of Looking at Antivirus Applications

Magazines rate programs from "best" to "worst". Which products are at the top of the list and which are at the bottom of the list? Does it really matter? If you look at the results from the major reviewers, you'll see that the "worst" antivirus programs probably received a score of 80 to 85 and the "best" antivirus programs received a score of 90 to 95. I would interpret this to mean that the "best" programs would find at most 95 infections out of 100 and the "worst" programs would find at least 80 infections out of 100.

So the question is: How significant is this difference? And that led to another question: How many infections have you had on your computer? My count for the past 10 years is zero.

I've had perhaps 10 or so antivirus alerts in the past 10 years or so, but not one has been for a problem that the wetware between my ears wouldn't have caught on its own.

I believe in antivirus programs for the same reason I believe in Word's spelling and grammar checkers: I might miss something that the automated process will catch.

That might suggest to you that I believe the difference between 80% and 95% isn't terribly important. To some extent, that is correct. Actually, I believe that the difference is meaningless.
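To put rough numbers on that gap, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that a review score can be read as a detection rate and that a machine meets 10 real threats in a decade (roughly my own alert count). Neither assumption comes from any reviewer's methodology, and the function name is made up for this example.

```python
def expected_misses(threats_encountered, detection_rate):
    """Expected number of threats that slip past the antivirus program."""
    return threats_encountered * (1 - detection_rate)


threats = 10  # assumed: about one real threat per year for ten years
for label, rate in (("worst", 0.80), ("best", 0.95)):
    missed = expected_misses(threats, rate)
    print(f"{label} program: about {missed:.1f} missed out of {threats}")
```

Under these assumptions, the gap between "best" and "worst" is a threat or two per decade, before the user's own judgment is counted at all.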

What is important is this: Find an antivirus program that stays out of the way enough that you won't disable it. When I decided that AVG Professional (paid) 9 got in my way too much, I removed it.

In the past several weeks, I've looked at Avast (which I considered to be more like a comic book than a computer application), Avira (which annoyed me with frequent full-screen advertisements), and G Data Total Care (which ran for a couple of weeks in test mode on my desktop system).

When I removed G Data from the desktop system, I was required to restart the computer and, when I logged on, I got a "temporary desktop" because Windows 7 said some important files had been deleted. It would be OK, the message suggested, if I restarted the computer again.

I did, and this time I was told that my copy of Windows wasn't legal—that I had to activate it. The activation was accepted and my familiar desktop returned, but this could cause quite a bit of distress for users.

For that reason, I recommend avoiding G Data.

I'm not trying to be argumentative or to suggest that anyone else has made the wrong decision by selecting some other antivirus program. I am simply explaining my criteria for making the decisions I have made.

Computer security is a lot like airline security: You can enable basic protection and depend on your own intelligence to thwart the bad guys or you can inconvenience yourself with procedures that make the system all but unusable and put your faith in window dressing.

So that brings me back to ...

AVG Free?
(I can't seem to learn my lesson.)

I installed the free version of AVG on the desktop, but it seemed to come with all of the resource-hogging features of the paid version, at least for the first 30 days.

So once again I removed AVG from the computer.

Although I might like to use an open-source antivirus application such as CLAM Win, I'm concerned that it offers no real-time scanning capability. And that led me to PC Tools Antivirus.

PC Tools
(DON'T EVEN THINK ABOUT THIS ONE!)

At first, I thought I had a winner here, but read on to understand why that's not the case.

"PC Tools" is an antique name in the utilities business, but today's PC Tools is not the PC Tools of the past. Fortunately, the current owners of the name seem to have an attitude about utilities that suggests these applications should protect the user, but should also stay out of the way.

I installed it and initially my frustration seemed to be ended.

But there is an irony here. I stopped using Norton Antivirus several years ago because it caused too many performance problems. Back in the days of DOS computers, PC Tools and Norton Utilities were big competitors. More than a year ago, Symantec acquired PC Tools from its Australian owners. And, yes, you're right: Symantec also owns Norton Antivirus.

At the time of the acquisition, Symantec said that PC Tools would continue to be an independent business unit under the management of its Australian management team. So far, it seems that they have kept that promise.

I then upgraded to a trial version of PC Tools Internet Security, an application that includes a number of features that I do not consider essential, and system performance seemed not to be severely degraded. Collecting e-mail takes longer than it would without the e-mail monitor, but—unlike with G Data—the additional time is measured in seconds instead of minutes.

I thought that I could recommend PC Tools, but when I decided to test Bit Defender on the desktop machine, removing the PC Tools product corrupted the Registry. These things can happen and I considered it to be little more than a fluke. But then I decided to try Bit Defender on the notebook and, after I uninstalled PC Tools, the notebook crashed on startup. In this case, the Windows 7 Repair utility resolved the problem.

Granted, this is too small a sample to be meaningful, but when uninstalling an application causes a catastrophic system failure on 100% of the 2 computers tested, I cannot recommend it.

Even worse, when I installed another antivirus program on the computer, it displayed some extremely odd characteristics. Thanks to the Bit Defender support team (the following item is about Bit Defender), we found that parts of PC Tools were still installed.

The Windows uninstaller couldn't uninstall PC Tools because the application had been partially uninstalled, so I manually edited the Registry to remove all references to PC Tools from the HKLM\Software section. Then I found a PC Tools directory and tried to delete it. Some of the files were in use, so I needed to use a utility to delete them on reboot.

Once PC Tools was gone from the notebook computer, Bit Defender operated as expected.

Will I Ever Find Something I Like?

A security-minded acquaintance has been urging me to try Bit Defender for a long time, so I finally decided to give it a try.

It hasn't been on the computer long enough for me to decide for sure, but it's promising despite a couple of stumbles by Bit Defender support. I'll tell you more about it in a few weeks.

Bit Defender has a three-bears interface: too little, too much, and just right. Actually, I've been using the too-much interface and I like it a lot. Users can tell the application how much control they want and this ranges from very little (Bit Defender makes all the important decisions) to total control (the user makes all decisions). In the middle, the user gets to make the important decisions.

I'm not yet ready to say that Bit Defender is the right solution, but it's certainly promising at this time. Give me a month or so and I'll give you a more complete report.

All is not perfect in Bit Defender land. When I tried to install the application on the notebook computer, it blocked all Internet access, which made it impossible for me to activate the trial version, which meant the trial version didn't work, which meant that I could never activate the trial version, which meant that I could never have access to the Internet. Is this reasoning sufficiently circular to amuse you? I can assure you that it didn't amuse me and that I muttered numerous epithets in the direction of Bit Defender.

It really shouldn't be this difficult. If you have a network connection, there's a reasonably high expectation that you might want to be connected to the Internet. Applications such as Bit Defender should protect you from bad things, but they shouldn't make it impossible to do what you want to do.

So I removed Bit Defender from the notebook computer, rebooted, and tried again. This time, Bit Defender showed me a dialog box that allowed me to specify that the wireless and wired connections were trusted. The registration still failed, but at least this time I had Internet access. Even so, both registration and the required update failed and, although I supposedly had Internet access through both wired and wireless connections, Internet access was blocked in both cases.

I was becoming somewhat less in love. Then I turned off the firewall and found that I could register. This is the equivalent of processed food from a male cow and, after a trouble-free installation on the desktop, I find it incredibly frustrating. It's almost enough to make me think that I could survive without any protection at all.

Almost.

I tried the Zone Alarm free firewall for a few days, but it was intrusive and it blocked Homegroup access among the various machines. So maybe the solution is the Microsoft firewall (starting with Vista, Microsoft's firewall is actually pretty good) and Bit Defender for the rest of it.

Stay tuned. I'll be back in a few weeks with an update.

Short Circuits

Disconnected!

How important is the Internet to you? I was reminded on Thursday, January 14, how important it is to me. When I arrived at home from the office, I logged on to my account on the computer and noticed that all of the Internet applications indicated no connection. When I called Wide Open West, an intercept recording said, in essence, "Yes, we know."

Four hours later, I called Wide Open West again. This time I bypassed the intercept message and reached a real person.

I was told that it was believed to be a fiber issue and that there was no ETA for resumption of service.

No big deal, right? Well, maybe. Except that I had planned to work on the script for the next edition of TechByter Worldwide and I needed the Internet to do some research (I wanted to confirm the dates when some computer CPUs were introduced). There was some work from the office that I wanted to do, but that required connecting to my computer in the office. Without an Internet connection, I couldn't check my e-mail. I'm used to being able to check the forecast and current temperature in several cities.

Well, not that Thursday, buster.

At one time, Wide Open West had dial-up that subscribers could use when they were traveling, but I haven't had a list of those numbers for years. Besides that, my desktop system doesn't even have a modem in it. I could have used the notebook computer via a modem connection if I had a list of the numbers.

Just as well, I suppose; a modem connection would just have been frustrating.

But a fiber issue? For most of this week I've seen utility crews working nearby with what was clearly fiber gear. That puzzled me because the neighborhood had been wired for fiber ("fibered"?) several years ago. Given the amount of fiber that was installed then, I thought that the companies involved in the project had installed plenty of fiber.

Maybe not.

Oh, well. The cable outage gave me the chance to watch 3 episodes from the first season of Perry Mason and a biography of Pete Seeger called "The Power of Song". Maybe it wasn't such a bad evening after all.

By the next morning, most service had been restored.

Googling Broken China?

Many media pundits have not been kind to Google this week. The commentators have wondered how and when Google found its moral compass. They've said that maybe Google finally decided not to be evil. And those are the positive comments.

This could be interesting. Who is stronger, Google or China?

Google was criticized for "caving in" and abiding by China's laws, rules, and regulations regarding Internet use. Now, because of activities (possibly sponsored by the Chinese government) to break into G-Mail accounts of dissidents, Google is threatening to leave China.

This would represent a large financial loss for Google, but would it be a corresponding gain for the dissidents?

Many of them seem to think not. If Google pulls out of China, they say it would be a big victory for the Chinese government and a big loss for the dissidents. Better, they say, for Google to stay in China and work for change from within.

It's not unusual for perceptions to be wrong, and it seems to me that the general perceptions in this case are wrong.

Google: Stay in China.

4-8-16-32-64—Hike!

The first personal computer central processing units were 4-bit devices, quickly followed by 8-bit and then 16-bit processors. After a few years, the world moved to 32-bit processors and a few years after that, 64-bit processors were released. Except that the world didn't follow the 64-bit leader (AMD was first, by the way). Now maybe it's time.
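For a sense of what each doubling buys, the short Python sketch below lists how many distinct values fit in one word at each size. It is plain arithmetic, offered only as an illustration: the function name is made up, and real processors often have address and data paths that differ from their nominal word size.

```python
def distinct_values(bits):
    """Number of different bit patterns a word of the given size can hold."""
    return 2 ** bits


for bits in (4, 8, 16, 32, 64):
    print(f"{bits:>2}-bit: {distinct_values(bits):,} distinct values")
```

Each step squares the previous count, which is why the jump from 32 to 64 bits is far larger than the name suggests.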

Until now I've stuck with 32-bit processors, but I'm now planning to move to 64-bit Windows and 64-bit Linux in February. I'll do this even though I know that some of the applications I use won't work exactly right in a 64-bit environment.

How We Almost Got to 64

"Will you still need me, will you still feed me, When I'm sixty-four?" The Beatles sang those words. It's been a long journey to 64, but we're now getting close. It's been a long time since the first commercial processor was released. In 1971, Intel started selling the 4004, a 4-bit processor, to a Japanese calculator company.

It's worth noting that 64-bit CPUs have been around since the 1960s. That's right. The Beatles were still together when 64-bit processors were developed. Granted, they were available only in supercomputers back then, but they started showing up in RISC-based (reduced instruction set) workstations and servers in the early 1990s. It wasn't until 2003 that AMD (not Intel) released the first 64-bit CPU for personal computers.

Where has 64-bit computing been for the past 7 years?

Let's go back for a moment to 1971 and the 4004 CPU. It ran at 0.74MHz and contained the equivalent of 2300 transistors. In 1972, Intel released the 8-bit 8008 CPU with 3500 transistors. Its successor, the 8080, powered the MITS Altair 8800, the first commercially successful microcomputer kit. The Altair was featured on the cover of Popular Electronics magazine in January 1975 and I wanted one.

In 1974, Intel released the 8080 CPU (2MHz with 6000 transistors). This was during the brief zenith of 8-bit computing devices. The 8086 followed in 1978 and the 8088 a year later, each with about 29,000 transistors. The 8088, running at 4.77MHz, was the processor used for the first IBM (and compatible) personal computers. It was nominally a 16-bit processor, but its 8-bit external data bus was a choke point.

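Then came the 80186 and the 80286 (both still 16-bit), followed by the 32-bit 80386, the 80486, and the Pentiums I through IV. These were followed by the "Core" and "Core 2" products. From the 80386 on, these were 32-bit designs, although later Pentium 4 models and the Core 2 added 64-bit support.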

AMD64 was announced in 1999, but didn't ship until 2003. It was called the Opteron.

Intel, AMD, and Microsoft have always tried to maintain backward compatibility. This means that even antique 8-bit programs will run on 16-bit and 32-bit systems. That's not the case with a 64-bit system. Some of your older programs won't work. Some newer programs may have a few features that don't work.

But the time has come to move on. Apple did this with the latest version of its operating system. My Apple systems can't run the latest version of the operating system because it depends on features found only in Intel processors, not in the Motorola processors that power my systems. But it was the right decision.

My plans now are to convert to 64-bit systems in February, which puts me about 5 years behind the vanguard. Stay tuned: I'll let you know how it goes.