TechByter Worldwide


10 Sep 2021 - Podcast #760 - (21:43)

It's Like NPR on the Web

If you find the information TechByter Worldwide provides useful or interesting, please consider a contribution.



Finding A Most Elusive Computer Problem

When your computer has a serious problem and you think you've found the solution, but then the problem re-emerges, there's only one choice: Keep looking. I thought I had found the root cause of an elusive problem in June after several weeks of tinkering.

Perhaps there's an alternative to continuing the research: Replace the computer and all the peripherals. That's not a good solution even if you can afford to do it. I described my primary computer's instant crash problems previously, in great detail, on the 25-Jun-2021 podcast. That's when I thought I had found the root cause of the problem, only to be proved wrong. I continued to investigate until I was certain that I really had found and fixed the problem.

Wrong again.

The computer ran more or less normally through July, although it sometimes was a bit sluggish. In the second week of August, the immediate-power-off crashes began to happen again. And again. The computer wouldn't boot if the USB hub was attached. That's a problem I'd seen before and thought I had fixed by replacing what seemed to be a failed USB hub with a new hub.

It seemed clear that the problem was related to one or more USB hubs or devices, but which ones? So I detached everything except the most essential devices: A 4-bay disk stack, the mouse, and the keyboard. I also removed the computer from its dock, which meant that I could run only one monitor in addition to the built-in notebook screen.


My desk was suddenly messier than usual.

Because of the computer's location, I no longer had an Ethernet connection, but Wi-Fi was OK because the computer was less than three feet from the router. The external sound system wasn't available. I couldn't record audio or use any of the scanners. The desk was even more cluttered than usual, which is something I considered impossible.

But the computer booted quickly and seemed more responsive. Before one of the crashes, I had observed CPU temperatures in excess of 90°C, which is clearly in the danger range and approaching TJmax, when the CPU would automatically throttle itself back. With the peripherals detached, temperatures hovered around 50°C. Additionally, I no longer saw instances in which all 8 logical processors (the CPU's 4 physical cores, each running 2 threads) reported 100% use.

TJmax (thermal junction maximum) is the temperature at which the CPU will slow down to avoid damage. For the Intel Xeon CPU E3-1505M v5 running at 2.80GHz, TJmax is 100°C. Any temperature over 80°C is troubling and 90°C or above is a clear sign of trouble. But what was causing this? And, I wondered, was the CPU reaching 100°C and then simply shutting down instead of throttling back?
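For anyone who wants to apply the same rules of thumb to their own readings, here is a minimal Python sketch. The 80°C and 90°C cut-offs are just the informal thresholds described above, not Intel specifications, and the function name is made up for illustration.

```python
TJMAX_C = 100.0  # TJmax for the Xeon E3-1505M v5, as stated in the text


def classify_temp(temp_c: float, tjmax_c: float = TJMAX_C) -> str:
    """Bucket a CPU temperature using the informal thresholds from the text."""
    if temp_c >= tjmax_c:
        return "at or above TJmax"
    if temp_c >= 90.0:
        return "clear sign of trouble"
    if temp_c > 80.0:
        return "troubling"
    return "ok"


# Sample readings chosen for illustration only.
for reading in (50.0, 85.0, 95.0, 100.0):
    print(f"{reading:5.1f} C: {classify_temp(reading)}")
```

A different CPU would need its own TJmax value, which is why it is a parameter rather than a fixed number inside the function.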

Reducing USB Connections

I took a few USB devices out of service permanently and freed up three USB ports by no longer leaving cables permanently connected for devices that are used only occasionally.

The computer (4 ports) and the dock (6 ports) provide nearly enough USB ports for everything, but I acquired a 4-port USB hub. The hub occupies one dock port and adds four, so it increases the available ports from 10 to 13. The connections are set up this way:

  • Computer back #1: Western Digital daily backup drive.
  • Computer back #2: Mouse/keyboard A/B switch
    (B position is for Macbook Pro).
  • Computer right (back) #1: Orico 4-bay disk stack.
  • Computer right (front) #2: (Open for use by backup devices, thumb drives, and camera card readers.)
  • Dock #1 (always powered): Focusrite Saffire 6 audio.
  • Dock #2: Plustek film scanner.
  • Dock #3: Epson flatbed scanner.
  • Dock #4: USB extension cable that can also be used by backup devices, thumb drives, and camera card readers.
  • Dock #5: (open)
  • Dock #6: 4-port unpowered USB 3 hub.
  • Hub #1: Blu-ray R/W device.
  • Hub #2: DVD R/W device.
  • Hub #3: Audio headset.
  • Hub #4: (open)
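The port arithmetic is easy to get wrong, so here is a small Python sketch that transcribes the list above and counts the ports. The dictionary layout, the "HUB" marker, and the function names are my own illustration, not anything a tool produces.

```python
# Port map transcribed from the list above. None marks an open port and
# "HUB" marks the dock port that is consumed by the hub's uplink cable.
ports = {
    "computer": [
        "Western Digital daily backup drive",
        "Mouse/keyboard A/B switch",
        "Orico 4-bay disk stack",
        None,
    ],
    "dock": [
        "Focusrite Saffire 6 audio",
        "Plustek film scanner",
        "Epson flatbed scanner",
        "USB extension cable",
        None,
        "HUB",
    ],
    "hub": ["Blu-ray R/W device", "DVD R/W device", "Audio headset", None],
}


def usable_ports(port_map):
    """Total ports minus the ones used only to connect a hub."""
    slots = [slot for group in port_map.values() for slot in group]
    return len(slots) - slots.count("HUB")


def open_ports(port_map):
    """Ports with nothing permanently attached."""
    return sum(slot is None for group in port_map.values() for slot in group)


print(usable_ports(ports))  # 13
print(open_ports(ports))    # 3
```

The count confirms the numbers: 14 physical ports, one of which feeds the hub, leaves 13 for devices.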

I've also labeled the USB cables (see image below). It's all too easy to forget which cable is connected where and to accidentally disconnect the wrong one. Don't ask how I know this.

Some USB device seemed to be the likely cause of the problem, but I have a lot of USB devices. Before concluding that the problem really was caused by one of the disconnected USB devices, I needed to let the computer run for at least a few days.

Day 1: No problems with the Sabrent USB hub disconnected. Using the computer was beyond inconvenient, particularly for applications that I had set up to use two monitors.

Day 2: No problems, but I needed to use some of the disconnected USB devices. The computer has a limited number of USB ports, two on the back and two on the right side. Late in the day, I disconnected the local backup drive and used that port for the Focusrite Saffire audio device to see if it would operate properly. It did.

I then ordered a 4-port non-powered hub from Anker. Maybe the powered hub from Sabrent was creating problems because the computer's USB ports are also powered. A well-made powered hub shouldn't cause a problem, and a non-powered hub certainly wouldn't. I wanted to have the computer back in its dock on day 4 so that both monitors would again be available.

Day 3: I examined device drivers and Registry entries while waiting for something to go wrong (or not) and found device drivers and Registry entries for Acronis TrueImage Backup and some leftover references to Kaspersky applications. I had removed both of these applications because they caused operational problems, but components remained and were still running. Additional research would be needed to determine how to remove these components, but that wouldn't happen until the computer was back in its dock.

Temporarily disconnecting the Focusrite Saffire 6 USB sound device, I was able to connect the disks used for weekly local backups and complete the usual Wednesday backups. Backups are essential, and this is even more true when there are ongoing performance problems.

By the end of day 3, the computer was back in its dock with a limited number of USB devices attached (directly to the computer or to the dock): A Western Digital drive for local daily backups, the Keyboard/Mouse A/B switch, an Orico 4-bay disk stack, and the Focusrite Saffire 6 USB audio device.

I began testing the computer in this configuration with the newly arrived 4-port unpowered Anker hub attached, but with no peripherals connected to the hub.

Day 4: I connected Blu-ray and DVD drives to the new hub and problems did not recur. Then I added the audio headset, the film scanner, and the flatbed scanner to the dock. I also updated Google Backup and Sync to Google Drive for Desktop. Introducing a new variable in the middle of testing could be considered bad form, but Google's documentation suggested that the new application simply added some features to the existing technology.

The computer continued to perform normally with the Sabrent 10-Port 60W USB 3.0 hub disconnected. Sabrent technical support wanted a video that shows the computer crashing. How could I do this? The computer was operating properly with the Sabrent device out of the picture and I had no desire to attach it again. Sabrent technical support seemed to be attempting to create enough frustration that the buyer simply gives up in disgust. Following a long series of back-and-forth messages and threats to involve both Amazon and the Better Business Bureau, Sabrent tech support passed the issue to customer service and the customer service representative immediately approved a refund.

“When you have eliminated all which is impossible, then whatever remains, however improbable, must be the truth.”
—Arthur Conan Doyle (The Case-Book of Sherlock Holmes)

Days 5, 6, and 7: There have been several false starts in diagnosing this problem, but it appears that the final answer is that the fault was with the Sabrent 10-Port powered hub. Previously, I thought that the device I'd installed a few years ago had simply gone bad. But the replacement device created the same problems, so logic suggests that the problem is deeper than a single failing device.

Is this the end? No, but it's close. I'll continue the tale in Short Circuits.

Short Circuits

Taking A Computer's Temperature To Research A Problem

Heat is the enemy of every electronic device. Too much of it can cause premature failure, and notebook computers run hot. That's simply a function of cramming a lot of small parts into a minimal amount of space and then using quiet fans (or no fans) to keep the noise level down. That's an issue I had considered in trying to sort through the primary computer's problems.

After nearly a week with no problems, ###CRASH###. The computer did its immediate-power-off trick again. Two additional clues appeared: First, restarting the computer immediately always led to another crash, but normal operation ensued if I waited an hour or so. Second, the computer was very busy when it crashed. I was running an image backup of drive C when an automatic backup of my user directory kicked off. A lot of files from drive C were being copied to two different drives. Managing that activity is a lot of work for the CPU.

Although CPUs should throttle back performance if overheating is a problem, a steep rise in temperature that occurs quickly might cause it to simply shut down. I had been concerned about heat before, but now it was my primary focus. Normally the notebook runs with the cover closed. That traps heat inside the case. The computer also sits flat on the desk, which might be holding in heat. The air inlet vents were clear and so were the fans, but the temperatures reported by Speccy were troubling. The computer also sits in a docking station made by the computer manufacturer, and the dock blocks some of the air inlet louvers.

I wondered what would happen if I ran the computer with the case open and if I elevated it slightly from the desk. With the case open and with the computer sitting on a Lap Works folding desktop device, the numbers from Speccy looked better and I wasn't seeing additional crashes.


Speccy doesn't have an option for logging system readings, though, and it's hard to work on one project while watching a monitoring application. The free Open Hardware Monitor shows more information than Speccy does and provides a logging option, so I downloaded it and ran it after using an aluminum laptop stand to place the computer about 5 inches above the desk. Those changes provided more opportunities for heat to escape from the top of the computer (1) and better air access below the computer (2). The computer's dock still blocks some of the air inlet louvers, and (thanks, Lenovo) I can't do anything about that.

Opening Photoshop CC had previously pushed all of the CPU cores to temperatures in excess of 90°C, but only one core reached 90°C with the case open and the computer not sitting on the desk. Better still, all four cores quickly returned to the normal range for this computer.

“Normal” for this computer is in the 70–85°C range, which is uncomfortably hot but acceptable. There are still instances of core 1 reaching 95°C, which is just 5°C shy of TJmax. Perhaps it's been exceeding TJmax and that's what has been causing the computer to shut down. Core 1 running hotter than the other three cores may be an indicator of an incipient CPU failure. All four cores were running with average temperatures under 65°C during a test that sampled temperatures every five seconds for several hours, but core 1's maximum temperature was nearly ten degrees Celsius (18 degrees Fahrenheit) higher than that of the other three cores.
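If you log temperatures at a fixed interval, the same comparison (per-core average and maximum, plus how fast a core heats up) takes only a few lines of Python. This is a rough sketch, not a description of how Open Hardware Monitor stores its logs: the sample values below are invented, and you would need to load your own readings into the same shape.

```python
from statistics import mean

# Invented samples for illustration: (seconds since start, [core temps in C]).
samples = [
    (0, [62.0, 60.0, 61.0, 59.0]),
    (5, [71.0, 63.0, 64.0, 62.0]),
    (10, [95.0, 84.0, 85.0, 83.0]),
    (15, [66.0, 61.0, 62.0, 60.0]),
]


def summarize(samples):
    """Return a list of (average, maximum) pairs, one pair per core."""
    per_core = zip(*(temps for _, temps in samples))
    return [(mean(core), max(core)) for core in per_core]


def steepest_rise(samples):
    """Largest single-core rise, in degrees C per second, between samples."""
    best = 0.0
    for (t0, before), (t1, after) in zip(samples, samples[1:]):
        for old, new in zip(before, after):
            best = max(best, (new - old) / (t1 - t0))
    return best


for number, (avg, peak) in enumerate(summarize(samples), start=1):
    print(f"core {number}: average {avg:.1f} C, maximum {peak:.1f} C")
print(f"steepest rise: {steepest_rise(samples):.1f} C per second")
```

The steepest-rise figure is there because a fast spike, rather than a high average, is the behavior I suspect of triggering the shutdowns.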

I'm hoping to keep the computer running until the current chip shortage is resolved and computers are shipping with Windows 11.

In the interim, I'm following procedures that I learned the hard way in the 1980s when computers crashed several times a day: Press Ctrl-S at the end of every paragraph when working with words, and following every significant change when working with any other data types. This ensures minimal data losses because even I can usually remember the last paragraph I wrote or the last change I made to a digital image or podcast recording.

Next week we'll have what I hope will be the final summary of this problem.

Poking Around In Windows 11

Some Windows users view the next few months with equal parts anticipation and dread as Microsoft prepares to roll out Windows 11. Only one of my computers will be able to run Windows 11, and it's already been updated. My primary computer and a secondary Windows system aren't eligible for the new operating system.


What about your computers? We already know that Windows 11 has specific hardware requirements that a lot of current computers don't meet. If that's your case, you'll be able to continue running Windows 10 until 2025. Even computers that meet or appear to far exceed specifications for Windows 11 may be excluded by the computer's CPU. To find out if your computer's CPU is supported, check Microsoft's website for Intel CPUs or AMD CPUs. If your CPU model isn't listed, you won't be upgrading to Windows 11. To find out what CPU model is in your computer, go to Settings > System > About and you'll find the processor listed there.

I've been working with Windows 11 on a touch-screen tablet for a while. There's a lot that remains the same, some improvements, and a few changes that I do not like.

Spare Parts

Supply Chain Attacks May Endanger Home Users

You've probably seen the term supply chain attack in accounts dealing with gigantic data breaches. We first started seeing these attacks being discussed in 2017, but they're not new. They are, however, much more common than in the past.

Scammers who attack a single company can get data from that single company, sometimes with a lot of work. Smarter scammers attack companies that provide services for other companies. By ignoring the end user and concentrating on suppliers of software or hardware, they position their malware in a trusted chain. Insert malicious code into a popular program or piece of hardware, and you'll soon have access to data from many companies.

Sneak malicious software into an update file for a popular game or business program and -- right -- you'll gain access to data on thousands, or hundreds of thousands, of computers. The update has come from a trusted supplier and probably won't be scrutinized as closely as would be an untrusted internet download. Russia and China are both havens for scammers. Some are sponsored by the state and others are tolerated by the state so long as the scammers stay away from government resources.

The answer to supply chain attacks, whether you're a business or a home user, is backup. Services such as CrashPlan from Code42, Backblaze, and Carbonite can retain original versions of files so that good copies can be recovered even if a malware application has deleted or encrypted all of the data files on your computer.

Will You Trust Facebook Virtual Reality?

Facebook, a company that isn't exactly known for its security precautions and often can't manage to display a user's timeline properly, thinks users should connect virtual reality (VR) headsets to the social media platform.

Facebook Reality Labs already has a working model that would allow users to see a pass-through version of what an Oculus Quest wearer sees. Facebook aside, the opportunities for merchants of porn are astonishing.

Cameras in the Oculus Quest 2 see the wearer's surroundings and show this view on the headset's screen, along with the virtual components. Facebook says the pass-through technology could be used to show the wearer's eyes, allowing users to see eye-to-eye.

Ahhhhh. Pass.

Twenty Years Ago: Does Anybody Remember DEC?

That was my question in 2001.

The Digital Equipment Corporation (DEC) had been a big player in the 1970s and 1980s. A manufacturer of minicomputers, it offered solutions for organizations that couldn't tolerate the expense of running mainframe computers from companies like IBM. But then desktop computers killed the market for minicomputers and eventually DEC was bought by an upstart named Compaq. Then an upstart named Dell snagged much of Compaq's market share by selling computers for less.

More text from back then: Gartner Dataquest said the new company would own 18% of the market for personal computers, 31% for powerful workstations, and 41% for printers. But HP already is the biggest seller of printers in the US. The acquisition makes little difference in that part of the market. The largest effect will be on the PC market. The two companies combine to account for more than 60% of US retail computer sales.

The new company would be called Hewlett Packard.