Error Correction System for Hauptwerk?

Buying or building computers for Hauptwerk, recommendations, troubleshooting computer hardware issues.

engrssc

Member

  • Posts: 6425
  • Joined: Mon Aug 22, 2005 11:12 pm
  • Location: Roscoe, IL, USA

Error Correction System for Hauptwerk?

Posted: Sat May 23, 2020 11:24 am

I need to overcome the learning curve before making a sizable investment in an ECC system. Speaking strictly as someone who doesn't know enough about ECC: could such a system be considered for a future Hauptwerk version, especially for use in a public performance situation?

An error-correcting code is an algorithm for expressing a sequence of numbers such that any errors which are introduced can be detected and corrected (within certain limitations) based on the remaining numbers. The study of error-correcting codes and the associated mathematics is known as coding theory.

As Drew mentioned, the sound from the organ was like machine gun fire. Given today's awareness of terrorism directed at places like churches, a glitch such as a RAM stick fault could conceivably cause serious panic. This leaves all public performance Hauptwerk installations vulnerable.

http://forum.hauptwerk.com/viewtopic.php?f=16&t=18662#p140586

I'm aware of many pre-built commercial organ problems, but never anything like this.

Rgds,
Ed

RaymondList

Member

  • Posts: 119
  • Joined: Sun Mar 11, 2018 3:46 pm
  • Location: North Carolina, US

Re: Error Correction System for Hauptwerk?

Posted: Sat May 23, 2020 11:35 am

I believe the ECC being discussed is implemented in the computer's RAM modules. Errors within the RAM itself are detected there. It would not involve any of the applications the computer may be running.
Ray

engrssc

Member

  • Posts: 6425
  • Joined: Mon Aug 22, 2005 11:12 pm
  • Location: Roscoe, IL, USA

Re: Error Correction System for Hauptwerk?

Posted: Sat May 23, 2020 11:43 am

Understood. But I just solved a problem with an SSD error issue. What made it most difficult was that the problem was random and intermittent. Many computer components can create difficult-to-detect problems, of course. Some type of error detection system, similar in theory to a checksum, may not be possible, but would certainly be useful. I don't have enough computer knowledge to comment on this.
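
To make the checksum idea concrete, here is a minimal sketch in Python. It is not a Hauptwerk feature, and the function names (file_digest, build_manifest, changed_files) are hypothetical, chosen only for illustration. It records a SHA-256 digest for every file in a folder and later reports any file whose digest no longer matches. This only catches corruption of data at rest on a disk or SSD; it does nothing about errors that occur in RAM after the data has been loaded.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of one file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to folder) to its digest."""
    folder = Path(folder)
    return {
        str(p.relative_to(folder)): file_digest(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def changed_files(folder, manifest):
    """List files from the manifest that are now missing or have different contents."""
    current = build_manifest(folder)
    return sorted(name for name in manifest if current.get(name) != manifest[name])
```

One possible use would be to build the manifest once after a known-good installation, store it somewhere safe, and run changed_files before an important performance.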

Rgds,
Ed
Last edited by engrssc on Sat May 23, 2020 11:51 am, edited 1 time in total.

mdyde

Moderator

  • Posts: 12068
  • Joined: Fri Mar 14, 2003 2:19 pm
  • Location: UK

Re: Error Correction System for Hauptwerk?

Posted: Sat May 23, 2020 11:47 am

Hello Ed,

To add to Raymond's reply, yes -- ECC RAM is a kind of fault tolerance built into RAM boards that helps them to avoid memory corruption, rather like some RAID systems (e.g. RAID 1, etc.) provide a level of fault tolerance for hard-disks/SSDs. Those things would be implemented at the hardware level (or, conceivably, in the operating system) -- it wouldn't be appropriate to try to implement things like that in Hauptwerk. They need to be low-level (within the hardware or operating system).
Best regards, Martin.
Hauptwerk software designer/developer, Milan Digital Audio.

engrssc

Member

  • Posts: 6425
  • Joined: Mon Aug 22, 2005 11:12 pm
  • Location: Roscoe, IL, USA

Re: Error Correction System for Hauptwerk?

Posted: Sat May 23, 2020 11:54 am

Thanks.

Rgds,
Ed

mdyde

Moderator

  • Posts: 12068
  • Joined: Fri Mar 14, 2003 2:19 pm
  • Location: UK

Re: Error Correction System for Hauptwerk?

Posted: Sat May 23, 2020 12:00 pm

[Topic moved here.]

P.S. Although they could technically be implemented either in hardware or at the operating system level, even if done by the operating system (e.g. software RAID), it would still increase hardware cost anyway because you would need more of the underlying type of hardware (more RAM, or more disks, respectively), and implementing directly at the hardware level is likely to give better performance than would be possible at the operating system level.
Best regards, Martin.
Hauptwerk software designer/developer, Milan Digital Audio.

engrssc

Member

  • Posts: 6425
  • Joined: Mon Aug 22, 2005 11:12 pm
  • Location: Roscoe, IL, USA

Re: Error Correction System for Hauptwerk?

Posted: Sat May 23, 2020 1:25 pm

The consideration is looking at possibilities for a public performance instrument. Presently, RAM and SSDs are relatively inexpensive. Even CPUs are "affordable" provided a person isn't building an RGB gaming machine. I can and do overlook a few glitches at home, as my "audience" now (I live alone since my wife passed) is pretty much limited to my 19-year-old dog, whose hearing is diminishing some.

I did install a RAID system in my local church instrument.

A question: is a RAM failure more likely to occur when approaching the RAM's limit? I'm under the impression that this isn't the case. Admittedly, I only know the basics of RAM, not many of the inside details. And sometimes a little bit of knowledge isn't the best situation. I never cease to appreciate the vast collective amount of knowledge available here. Again, many thanks, especially to Martin as well as the many others willing to take the time to share. :)

A wild idea with no realistic basis - running two identical computers simultaneously? Similar to my Toyota hybrid Prius which has 2 engines with the computer "brain" figuring out which one is "online". Which, BTW is totally seamless. 8)

Rgds.
Ed
Last edited by engrssc on Sat May 23, 2020 1:40 pm, edited 1 time in total.

RaymondList

Member

  • Posts: 119
  • Joined: Sun Mar 11, 2018 3:46 pm
  • Location: North Carolina, US

Re: Error Correction System for Hauptwerk?

Posted: Sat May 23, 2020 1:35 pm

No, RAM is not more likely to fail when approaching a 'full' condition. RAM is very dependable these days. From my perspective, RAM (and all electronics) has two enemies: heat and turn-on power surges. We NEVER would turn off anything we did not need to. The start-up power surge was a big enemy. I remember one datacenter I worked at where they needed to do some electrical work one evening and told us to turn off our terminals (which had been on for years). When we turned them on in the morning, 7 of the roughly 85 failed! For many years, "keep it cool and powered" was the main motto for longevity. Today, I think heat would be the biggest enemy, but again, RAM is quite dependable today.

Yes, all these error handling processes cost money, many times lots of it. I was always amazed when attending the "Morning Status Meeting" to discuss any issues that occurred during nightly batch processing when, every once in a while, it would be announced that the Z12 mainframe had "phoned home" during the night. It had detected some type of hardware issue, isolated it, and kept on processing with no apparent issues to user programs. It even ordered the correct replacement part (which was on a plane on the way here by the time of the meeting). But that costs LOTS! Enough of my rambling.

Regards,
Ray

engrssc

Member

  • Posts: 6425
  • Joined: Mon Aug 22, 2005 11:12 pm
  • Location: Roscoe, IL, USA

Re: Error Correction System for Hauptwerk?

Posted: Sat May 23, 2020 1:49 pm

Very interesting, Ray. Definitely not rambling. Today it seems technology is progressing so fast that it's all but impossible to keep up (solely) with what's current. That's why I don't save many software test programs, but use the portable versions.

Are RAM coolers of value? I ask because on many motherboards, the RAM is mounted next to the CPU socket.

Rgds,
Ed

RaymondList

Member

  • Posts: 119
  • Joined: Sun Mar 11, 2018 3:46 pm
  • Location: North Carolina, US

Re: Error Correction System for Hauptwerk?

Posted: Sat May 23, 2020 3:25 pm

Electronics today are quite dependable, if they are designed correctly (physically). Some cheap computers obviously give no consideration to longevity. A bunch of plastic surrounds everything. They are cheap, and meant to be replaced when they fail, as long as they make it through the warranty period. What surprises me most is when I see pictures of computers tucked into a small crevice in an organ case with no consideration for ventilation! Even then, they last quite a long time (usually). When working hard, the CPU can generate a lot of heat. And if you get a lot of heat, you will start getting errors.

On my Mac, I use iStat (like a mini Omegamon for any mainframers out there), which right now is monitoring the temperature (among a ton of other things) at 27 places in my MacBook Pro. Right now (doing nothing really) the 4 CPU cores are around 112 degrees F (115, 113, 111, and 109), and the memory proximity measurement is at 104 degrees. The two fans are running at 2000 RPM and the outgoing airflow temperature is 93 degrees. When pushing this thing, the CPUs will start to get up around 190 degrees or more, and then the fans will slowly climb to 4400 RPM. And I have an aluminum case, which is a great heatsink!

As you can tell, you have me really interested now with all of this. Unfortunately, I recently had a knee replaced and can't fire up Hauptwerk. When I can, I'll do a run, push everything, and see what all these temps can get up to. I never really looked at temps, since my CPUs rarely work harder than about 40% capacity running HW (I do look at those numbers). I just make sure the box has a lot of room for airflow, as everyone should.

Just note, when Apple released their new Mac Pro, they made a really big deal out of how long and hard they worked on silent airflow through the box to keep it cool. Check it out on their web page. Mimic them and you are good.

Sorry I rambled again - just sitting here unable to do anything but exercises!
Regards,
Ray

Purator

Member

  • Posts: 138
  • Joined: Sat Dec 30, 2006 4:52 pm
  • Location: Leipzig, Germany

Re: Error Correction System for Hauptwerk?

Posted: Sat May 23, 2020 3:40 pm

Hello,

RAM is certainly very dependable these days. On the other hand, the amount of RAM in a system has also become much larger. ECC is, as far as my knowledge goes, most important for systems that write a lot of data, such as file servers, and of course for cases where a failure has rather unpleasant outcomes. Hauptwerk, on the other hand, reads lots of data but does not write much.

engrssc wrote: A wild idea with no realistic basis - running two identical computers simultaneously? Similar to my Toyota hybrid Prius which has 2 engines with the computer "brain" figuring out which one is "online". Which, BTW is totally seamless. 8)


Well, Hauptwerk needs really, really seamless operation. Computer-based interlockings work with three computers. Two of them have to get the same result, and this result is then executed. It works with a very small delay, but that delay would be too large for the audio of Hauptwerk.
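
The two-out-of-three principle described above can be sketched in a few lines of Python. This is only an illustration of the voting step: the function name vote is hypothetical, the three results are assumed to be directly comparable values, and nothing here models the timing constraints that make the approach impractical for real-time audio.

```python
from collections import Counter


def vote(results):
    """Return the value that at least two of the three units agree on.

    Returns None when all three units disagree, meaning no result
    can be trusted and nothing should be executed.
    """
    if len(results) != 3:
        raise ValueError("expected exactly three results")
    value, count = Counter(results).most_common(1)[0]
    return value if count >= 2 else None
```

For example, vote([5, 5, 7]) accepts 5 and outvotes the faulty unit, while vote([1, 2, 3]) yields None.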

Kind Regards,
Rico

RaymondList

Member

  • Posts: 119
  • Joined: Sun Mar 11, 2018 3:46 pm
  • Location: North Carolina, US

Re: Error Correction System for Hauptwerk?

Posted: Sat May 23, 2020 3:52 pm

engrssc wrote: A wild idea with no realistic basis - running two identical computers simultaneously? Similar to my Toyota hybrid Prius which has 2 engines with the computer "brain" figuring out which one is "online". Which, BTW is totally seamless. 8)

Rgds.
Ed


Actually Ed, you are just a little behind the times with your idea. Years ago, ATM networks used to run on "Tandem" computers. 2 of everything with instantaneous failover in the event of a problem. Founded in 1974, defunct in 1997. A baby that sat right next to the IBM mainframe I worked on back then. You have a great idea, just 46 years too late to make a nice profit!

Regards,
Ray

nrorganist

Member

  • Posts: 41
  • Joined: Mon Sep 26, 2016 7:22 pm
  • Location: Northern Colorado, USA

Re: Error Correction System for Hauptwerk?

Posted: Sat May 23, 2020 6:31 pm

Ed, Drew and all,

Puget Systems has been a successful custom computer builder for nearly 20 years. I have not been a customer of theirs, but I respect the information they share.

I found a late 2013 article on their website which appears to summarize ECC memory clearly, concisely and fairly. The article is at:
https://www.pugetsystems.com/labs/articles/Advantages-of-ECC-Memory-520/

In summary:

- ECC stands for Error Correction Code memory
- ECC memory protects against data corruption by automatically detecting and correcting memory errors
- ECC memory adds 1 memory chip to standard non-ECC memory's 8 memory chips on a memory card
- ECC generates a 7-bit code for each 64 data bits in memory
- on a 64-data-bit read, a second 7-bit code is generated and compared with the original 7-bit code
- if the two 7-bit codes match, then there are no errors
- otherwise, the ECC memory system finds which bits are in error by comparing the two 7-bit codes, then fixes the erroneous data bits
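
As a rough illustration of the generate-and-compare steps above, here is a small Python sketch of a Hamming-style code with 7 check bits for 64 data bits. It is an assumption-laden toy, not how any particular memory controller is implemented: the function names are hypothetical, it corrects only a single flipped bit, and production ECC typically adds one more parity bit so that double-bit errors can at least be detected.

```python
def data_positions(n_data=64):
    """Codeword positions (1-based) that hold data: every index that is not a power of two."""
    positions = []
    pos = 1
    while len(positions) < n_data:
        if pos & (pos - 1):  # non-zero only when pos is not a power of two
            positions.append(pos)
        pos += 1
    return positions


POSITIONS = data_positions()  # 64 positions, the largest being 71, so codes fit in 7 bits


def check_code(word):
    """7-bit check code: XOR of the positions of all data bits that are set."""
    code = 0
    for i, pos in enumerate(POSITIONS):
        if (word >> i) & 1:
            code ^= pos
    return code


def read_with_correction(word, stored_code):
    """Recompute the code on read and compare it with the stored one.

    Returns (data, error_found). A single flipped data bit is corrected.
    """
    syndrome = check_code(word) ^ stored_code
    if syndrome == 0:
        return word, False                    # codes match: no error
    if syndrome in POSITIONS:
        bit = POSITIONS.index(syndrome)       # syndrome names the flipped data bit
        return word ^ (1 << bit), True
    return word, True                         # a stored check bit flipped; data is intact
```

The difference (XOR) between the stored code and the recomputed code is exactly the position of the flipped bit, which is why comparing the two codes is enough to locate and repair a single-bit error.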

ECC and non-ECC failure comparison
- In 2011, Kingston ECC failed .53% and Non-ECC failed 1.16%
- In 2012, Kingston ECC failed .19% and Non-ECC failed .93%
- In 2013, Kingston ECC failed .00% and Non-ECC failed .83%

- As of late 2013, their testing of non-ECC memory dropped their failure rate from about 1% to about .4%.
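
To put the yearly figures above side by side, the following short Python snippet computes how many times more often non-ECC memory failed than ECC memory. The percentages are the ones quoted in this post; the variable and function names are only illustrative.

```python
# (ECC failure %, non-ECC failure %) per year, as quoted above
FAILURE_RATES = {2011: (0.53, 1.16), 2012: (0.19, 0.93), 2013: (0.00, 0.83)}


def non_ecc_to_ecc_ratio(ecc_pct, non_ecc_pct):
    """Return non-ECC failures as a multiple of ECC failures, or None if ECC had none."""
    if ecc_pct == 0:
        return None
    return non_ecc_pct / ecc_pct


for year, (ecc, non_ecc) in FAILURE_RATES.items():
    ratio = non_ecc_to_ecc_ratio(ecc, non_ecc)
    label = "n/a (no ECC failures)" if ratio is None else f"{ratio:.1f}x"
    print(year, label)
```

By these numbers, non-ECC memory failed roughly 2.2 times as often as ECC in 2011 and 4.9 times as often in 2012; for 2013 no ratio can be formed because no ECC failures were recorded.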

- Most desktop systems won't work with ECC memory or ECC functionality is disabled.
- Most server and workstation motherboards require ECC memory.

- ECC memory costs 10-20% more than non-ECC memory.
- ECC memory can be .72% to 2.2% slower than non-ECC memory.

-----

Alternatively, a way to be more confident of non-ECC memory is to extensively test each memory card.

MemTest86 (https://www.memtest86.com/index.html) has a free memory testing tool which I have used periodically for more than 15 years. MemTest86 has 13 different tests (for example: address tests, moving inversions tests, block move). MemTest86 also includes a test for Row Hammer errors (a common memory error defect).

From: https://www.memtest86.com/tech_individual-test-descr.html:

The row hammer test exposes a fundamental defect with RAM modules 2010 or later. This defect can lead to disturbance errors when repeatedly accessing addresses in the same memory bank but different rows in a short period of time. The repeated opening/closing of rows causes charge leakage in adjacent rows, potentially causing bits to flip.

This test 'hammers' rows by alternatively reading two addresses in a repeated fashion, then verifying the contents of other addresses for disturbance errors. For more details on DRAM disturbance errors, see Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors by Yoongu Kim et al.

-----

My long term MemTest86 experience has been - any of my memory cards (non-ECC or ECC) which has passed MemTest86, has not failed subsequently.

As you can see, both non-ECC memory passing MemTest86 tests and ECC memory could be options to reduce errors, particularly for public performance Hauptwerk systems.

Mark

engrssc

Member

  • Posts: 6425
  • Joined: Mon Aug 22, 2005 11:12 pm
  • Location: Roscoe, IL, USA

Re: Error Correction System for Hauptwerk?

Posted: Sat May 23, 2020 7:16 pm

You beat me to mentioning MemTest86, Mark, which is for sure very useful. Another useful testing tool, although not directed at RAM faults, is ATTO Disk Benchmark (https://www.atto.com/disk-benchmark/).

I don't recall that much, if anything, has ever been specifically focused on making Hauptwerk a fail-safe system. That is understandable, because the average HW user doesn't need that. But using Hauptwerk as a public performance instrument is a different matter. AFAIK, there really aren't many, if any, other good, practical alternatives in that regard. Greenwood UMC is a classic case in point as to what can be done. In its own right, an amazing piece of work.

We know there are other proprietary, extensive as well as expensive, VPO systems. But for many, this may not be a realistic approach, whereas Hauptwerk does a remarkable job. The weak link is the hardware in most cases. Understanding what can be done is more than helpful for some/many of us. Again, thanks for sharing this information.

The RAM in my system does get noticeably warm at times, but also cools soon after the load is lessened. Hence my question whether RAM coolers are useful. I'm using 2 sticks of 16 GB in a computer with 4 RAM slots, leaving a fair amount of space between the sticks. But going to 64 GB, which I am considering for the future, will eliminate that extra space.

As far as temperature concerns go, the 2 TB NVMe SSD also gets warm even with a heat spreader attached.

Rgds,
Ed

RaymondList

Member

  • Posts: 119
  • Joined: Sun Mar 11, 2018 3:46 pm
  • Location: North Carolina, US

Re: Error Correction System for Hauptwerk?

Posted: Sat May 23, 2020 7:52 pm

Yes, they are available. See for example:

https://www.newegg.com/corsair-cmyaf-fa ... 001T-00088

However, would your setup allow the addition of one or two muffin fans to increase the general airflow? That would be if you really think you need this. As you can see from nrorganist's post, RAM is getting more and more dependable each year. Don't forget, nothing is perfect. I've played many a pipe organ and had 'fun' during services!
Ray