Sample Vs Frequency Packets.

300 sensor systems… I can’t say I’m aware of all of those, but I do know that there’s a whole lot of processing going on before what you see reaches the visual cortex.

In a loose sense, the entire visual cortex could be considered a large set of sensor systems, given that’s where you sense orientation (‘bar’, ‘grating’, ‘end-stopped’ a.k.a. corners), motion (direction-sensitive cells) and other higher-order stimuli. A stroke in the right place could mean something as specific as the corners of things becoming less attention-grabbing, or even altogether imperceptible to you. I once smoked salvia divinorum and I am pretty sure it did exactly this; the edges of things seemed to ‘run off past the corners’ – combined with the odd sense of somebody else controlling my movement, I naturally concluded that these were definitely force fields and I had to carefully step over each and every one of them :) …Anyway

So even before visual input comes near the brain, all this happens, and probably more:

  • In the retina, rods, cones and photosensitive ganglion cells directly take visual input (the ganglion cells are much deeper in the retina than the others)
  • Cells in the retina perform edge detection to compress the visual information enough to travel the optic nerve; basically, a high-pass/sharpen filter made of neurons
  • ‘Retinal ganglion cells’ (all 5+ different classes of them) send the results on their merry way into the brain, some types extending all the way through the optic nerve. Types include:
      - Midget cells sense color changes well, and also sense large contrast changes (but not minor ones so much)
      - Parasol cells sense minor contrast changes well, but not color changes
      - Bi-stratified cells are known to sense moderate contrast changes, and are only affected by inputs from blue cones

Below: the insane amount of cells which process your vision before it gets to your brain. Direct input starts at the bottom of the pic, and moves up.

I should think the auditory system would be a lot simpler. I know there’s some sort of high-pass filter that works on a per-frequency basis, i.e. it effectively ‘tunes out’ unchanging, continuous sounds. I should expect the auditory cortex picks up on sudden stops and starts, since those are fairly attention-grabbing things. Probably some more cells for rising and falling tones, helping speech recognition, supplementing general sound recognition, fuelling the ‘infinite’ perception of the Shepard tone etc.

Basically, the rule of perception seems to be: think of every ‘event’ you recognise and there’s probably a cell or set thereof to ‘sense’ it. I guess a great portion of the entire brain is some sensor or another.

EDIT: Back on topic – my ‘naive’ algorithm for baby’s-butt-smooth sound stretching, preserving pitch. Feel free to pilfer as desired, no patents here (that I am aware of).

Following on from basic FFT theory, sound could be considered to be made of sine-wave ‘grains’ of all different frequencies, amplitudes and offsets in time (roughly, I would guess that phase = grain start time MOD period).

  • A grain is a single period of a sine-wave; an ‘atom’ of sound
  • Grains/sec is low for low frequencies => little storage requirement
  • Grains/sec for treble is high => greater storage requirement

We may need some sort of timing information for every grain if we would like to preserve phase information. For now, that’s a bit over my head so I will politely forget about it :)
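To pin down the bookkeeping for myself (just my own reading of the grain picture, nothing authoritative), a row at frequency f holds one amplitude per period, and the phase guess above comes out as a simple modulo:

```python
# Rough bookkeeping for the grain picture above (my reading of it, not gospel):
# a row at frequency f stores about f grains per second, and a grain starting
# at time t0 sits at phase (t0 mod 1/f).
def grains_per_second(frequency_hz):
    return frequency_hz                       # one grain per period of the sine

def grain_phase(start_time_s, frequency_hz):
    period = 1.0 / frequency_hz
    return (start_time_s % period) / period   # phase as a fraction of one cycle
```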

Anyway, into the meat of the time-stretch. Let’s say we take the ‘grid’ approach, where you have a sort of grid, but every row is chopped up finer and finer as you go higher in frequency. Each cell represents a grain, represented by its amplitude only (frequency is inferred from the row number, timing is inferred from the column number).

  • A 2x speed-up would be trivial: remove every second (time-domain) cell, making the entire grid half-length.
  • A naive but decent 2x slow-down would simply involve playing every grain twice

Finer speed changes would entail removing/repeating every Nth grain in each frequency row, at its simplest (i.e. nearest-neighbour sampling). As you can imagine, we could apply any of the usual interpolations to recover any stretch ratio we like; a rough sketch of this follows below.
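Here’s a minimal Python/NumPy sketch of that stretch step, assuming the analysis stage has already produced one amplitude-per-grain sequence per frequency row. The names and the crude ‘hold each amplitude for one period’ resynthesis are my own placeholders, not a finished design:

```python
import numpy as np

SAMPLE_RATE = 44100

def stretch_row(amplitudes, factor):
    """Nearest-neighbour resample of one row's per-grain amplitudes.
    factor > 1 slows the sound down (repeats grains), factor < 1 speeds it up
    (drops grains): factor == 2.0 plays every grain twice, factor == 0.5 keeps
    only every second grain, as in the list above."""
    n_out = max(1, int(round(len(amplitudes) * factor)))
    # nearest source grain for each output grain
    src = np.minimum((np.arange(n_out) / factor).astype(int), len(amplitudes) - 1)
    return amplitudes[src]

def render_grid(grid, factor=1.0):
    """grid: list of (frequency_hz, per-grain amplitude sequence).
    Each grain of a row lasts one period of its sine (rounded to whole samples);
    all rows are summed to give the output signal."""
    rows = []
    for freq, amps in grid:
        amps = stretch_row(np.asarray(amps, dtype=float), factor)
        period_samples = int(round(SAMPLE_RATE / freq))
        env = np.repeat(amps, period_samples)          # hold each grain's amplitude
        t = np.arange(len(env)) / SAMPLE_RATE
        rows.append(env * np.sin(2 * np.pi * freq * t))
    out = np.zeros(max(len(r) for r in rows))
    for r in rows:
        out[:len(r)] += r
    return out

# Example: a 220 Hz row fading out and a 440 Hz row fading in, stretched to twice the length.
grid = [(220.0, np.linspace(1.0, 0.0, 100)),   # 100 grains, roughly 0.45 s at 220 Hz
        (440.0, np.linspace(0.0, 0.5, 200))]   # 200 grains, roughly 0.45 s at 440 Hz
slow = render_grid(grid, factor=2.0)           # twice as long, same pitch
```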

:walkman: And for pitch-shifting, just stretch the sound as above, then re-pitch in the usual Renoise way in the opposite direction, e.g. To double the pitch: double the length using the above method, then play an octave higher.
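To make that last step concrete (continuing the hypothetical sketch above; in Renoise itself you would simply trigger the stretched sample an octave up rather than resampling by hand):

```python
import numpy as np

def play_octave_up(samples):
    """Read the (already 2x-stretched) sound back at double speed via linear
    interpolation - the 'play an octave higher' step - giving double the pitch
    at roughly the original length."""
    idx = np.arange(0, len(samples) - 1, 2.0)
    return np.interp(idx, np.arange(len(samples)), samples)

# e.g. octave_up = play_octave_up(render_grid(grid, factor=2.0))  # using the sketch above
```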

Maybe, maybe not.
We don’t know for sure.
The auditory system needs a biochemical recharge for each distinct sensor.
It’s the same with the visual system - sensors have to recharge biochemically before they can receive and transmit new data. The eye moves along arc trajectories so it can use charged cells for each scan. And roughly five times a second the eye switches off completely (to recharge), so it doesn’t see anything. But a buffering part of the brain glues these scans together, so a human doesn’t notice the “dark times” (experiments have actually shown them as grey light)…

martyfmelb, thanks for the detailed comments about the human visual system.

What happens in the eye is strange, but what happens in the brain is even stranger and more complex. Visual images are stored in frequency layers of the cortex, in an encoded form.
When someone takes psilocybin or another psychoactive drug, they may notice that the world looks different. That is because the brain’s response now differs a little from the everyday one - which is why a person can notice things that were hidden before.

That was a good try, but it can’t reproduce drums.
We had a topic about it on the OpenMPT forum.

Finally I can formulate what Sound is!!

Data about sensor systems from
It-Alien, kazakore, martyfmelb, kickofighto,
and my little experience with software like Cool Edit 2000 helped me reconsider the reality of sound.

Vibrations are made up of quantums!
A sine wave at a high frequency is supposed to sound expressive when transposed two octaves down.
But it does not happen!
You know why?
Because the quantums get smothered!



As you can see, each quantum is a kind of “attack” of the sound wave, and it “pumps” an amount of energy.
The first variant of the transposed wave has its attack smothered - that is what happens when a sample is played back at a lower frequency.
So it does not pump!
The second variant of the wave is transposed based on quantum perception.
It pumps well!
That is why the second variant will sound much better!

So playing samples at different speeds in an attempt to simulate notes is TOTALLY not the right way of making many notes from one sample!

I still don’t get it. I think we can all agree that sound physically manifests itself / travels as waves, which by their very nature have a certain number of cycles per second regardless of whether they are sound, light, water, seismic, etc. So I do not see how pitch (which is fundamentally based on frequency, i.e wave cycles over time) can be any better simulated / reproduced than by increasing / decreasing these wave cycles as is the current norm?

Also, what is the practical application of this from a musical standpoint? I don’t know many people who are trying to perfectly recreate the sound of an 88-key grand piano from a single C-4 sample. There are already many hyper-sampled and velocity-layered VST / AU instruments that facilitate this much more faithfully.

Just listen to yourself.
Sound manifests itself as the vibration of particles - atoms and molecules of a medium, a gas under pressure. So how can the movements of atoms be a form of light, which is a vibration of the electromagnetic field?

Still believing your school books and the teachers who never read anything beyond their set curriculum? :huh:

This is a challenge for sound engineers. Accepting it is not mandatory.

You tell me…

It looks like something to do with parasitic harmonics that happen because of the stupid - stupid - stupid fixed sample rate (44100, for example). Sampling should be done at a chaotic sample rate to exclude this effect.

Maybe it is part of the problem. Only part - otherwise this topic would not exist.

hehe, a random LFO wired to a lofi sounds cool

It will help with polishing up samples and making them sound natural at other frequencies, but it doesn’t make them more dynamic.
I’d rather go a step further and truly generate the frequencies and harmonics involved with the specific sound that belongs to these sound packets.
That is something Applied Acoustics Systems do in their plugins (instrument simulation), and they do a pretty good job with certain instruments (String Studio has some great electric guitar simulation imho), but their formulas only span a certain frequency range, so their plugins lack the formula layers to recalculate frequency behaviour at different octave levels.
This is, for instance, the reason why the piano in String Studio sounds pretty great between C5 and C6, but at the lower octaves the piano sounds unreal: the hammer impact isn’t correct and the harmonics don’t fit.

I offer a holographic approach as a solution to this problem.
Each note of the piano is recorded into memory.
After analysis it should become clear which frequencies change nonlinearly.
Exactly how they change is represented by the information in the hologram array.
(Or you may call it simply compressed data of an interpolated frequency dependency.)
Beating the strings or scratching them gives harmonics that change differently from the instrument’s other harmonics. (Actually, I don’t know what the word “harmonics” means.)

Also, here is the first explanation of the same approach: link

“like noise from touching or scratching strings or beating at them will be transposed accordingly to hologram”

Yes, but the noise does not change in some cases… Hammers are the same for specific ranges of notes, so in those cases, the start of the strike should not be transposed at all.
This means that samples should be divided into smaller sections where certain sections cover a complete range of notes while the other sections are recalculated and polished on every frequency change.

Edit: Harmonics are a different deal; they are hard to compensate for with Fourier analysis, because you need to know the harmonic frequency scale and adjust a sub-frequency inside the tone accordingly.
Harmonics are also a bit hard for me to explain, as I’m no expert either, but usually these are one or more extra frequencies sounding simultaneously along with the main generated frequency. The extra frequencies are side effects of resonance.
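Going back to the ‘divide the samples into smaller sections’ point, a rough sketch of it might look like the following: an untransposed attack slice plus a repitched body. The 30 ms split point and the plain linear-interpolation resampler are arbitrary choices for illustration, not anything prescribed here.

```python
import numpy as np

def repitch(body, semitones):
    """Resample the body by a pitch ratio using linear interpolation."""
    ratio = 2.0 ** (semitones / 12.0)               # playback-speed ratio for the new pitch
    idx = np.arange(0, len(body) - 1, ratio)
    return np.interp(idx, np.arange(len(body)), body)

def play_note(sample, semitones, sample_rate=44100, attack_ms=30):
    """Keep the hammer/attack slice untransposed, repitch only the body.
    No crossfade at the join, purely for brevity."""
    split = int(sample_rate * attack_ms / 1000)
    attack, body = sample[:split], sample[split:]
    return np.concatenate([attack, repitch(body, semitones)])
```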

So it shall be!!!
A holographic array can manage it all.

No, no, no. You’re taking it too simply.
The sample is split into frequencies along one axis, with density/duration along the other axis.
So some frequencies are transposed one way, and others another.
This way you can play a drum by notes, preserving the noise from the hit while transposing the other frequencies as desired.
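One concrete way to read that (my own interpretation, sketched with a crude whole-file FFT band split rather than the per-frame processing a real tool would use): repitch everything below some cutoff and leave the bins above it - the noise of the hit - untouched. The 2000 Hz cutoff is an arbitrary placeholder.

```python
import numpy as np

def band_split(x, cutoff_hz, sample_rate=44100):
    """Split a signal into a below-cutoff part and an above-cutoff part via the FFT."""
    spec = np.fft.rfft(x)
    cutoff_bin = int(cutoff_hz * len(x) / sample_rate)
    low, high = spec.copy(), spec.copy()
    low[cutoff_bin:] = 0.0
    high[:cutoff_bin] = 0.0
    return np.fft.irfft(low, n=len(x)), np.fft.irfft(high, n=len(x))

def transpose_tonal_only(drum, semitones, cutoff_hz=2000, sample_rate=44100):
    """Repitch only the tonal band; the noise band (the hit) stays put."""
    tonal, noise = band_split(np.asarray(drum, dtype=float), cutoff_hz, sample_rate)
    ratio = 2.0 ** (semitones / 12.0)
    idx = np.arange(0, len(tonal) - 1, ratio)
    tonal_shifted = np.interp(idx, np.arange(len(tonal)), tonal)
    n = min(len(tonal_shifted), len(noise))
    return tonal_shifted[:n] + noise[:n]
```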

Yes, it was interesting.
I can guess that notes are being determined by the dominating frequency and the apparent overtones.
Also, maybe an EQ is applied after transposition to “polish” the result somehow.
Though I don’t know how a transposed note will sound if the shift is about 3 octaves.
And don’t forget - the guitar had 6 strings, so at least 6 samples are available.
I’m sure this is an excellent tool for musicians. But if it is so good - why do we still use samples and instruments?
The primary idea is to take a real instrument and pack it into an array of data from which the instrument can be played at any sane frequency.
And I am very sure my recent quantum sound explanation will become famous very soon, as soon as it reaches interested scientists. To me it seems even more fundamental than “frequency packets”.

I got lost at ‘holographic array’. What does that mean in software (data structure/algorithmic/mathematical) terms?

I’ve been thinking about it for the last 12 years.
In common terms, its properties can be described as a “multidimensional array”. “Holographic” means that each part carries information about the whole array.
So doubling the number of pieces doubles the precision.
Of course it doesn’t have to be holographic, but I sense it will eventually be required.
Such arrays will probably be widely used in the future, everywhere from P2P video streaming to 3D modelling.
I have offered it to torrent communities several times, and to one university, but people are “not ready” yet.
They are too busy with other problems, and multidimensional compression sounds to them like a nightmare. :wacko:
A particular implementation can be done one way or another.
The provability of the idea rests on the informational capacity of a computer’s memory.
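For what it’s worth, in plain data-structure terms one very literal (and entirely speculative) reduction of the “pack an instrument into an array” idea could look like the sketch below: a 2-D array of harmonic amplitudes per recorded note, with unrecorded notes filled in by interpolating between their recorded neighbours. The sizes and names are mine, purely for illustration.

```python
import numpy as np

N_NOTES, N_HARMONICS = 88, 16
table = np.full((N_NOTES, N_HARMONICS), np.nan)    # NaN marks notes that were never recorded

def store_recorded_note(note_index, harmonic_amplitudes):
    """Store the measured amplitudes of the first N_HARMONICS harmonics for one recorded note."""
    table[note_index, :] = harmonic_amplitudes

def harmonics_for(note_index):
    """Estimate a note's harmonic amplitudes; unrecorded notes are interpolated
    from their nearest recorded neighbours, one harmonic at a time."""
    recorded = np.flatnonzero(~np.isnan(table[:, 0]))
    return np.array([np.interp(note_index, recorded, table[recorded, h])
                     for h in range(N_HARMONICS)])
```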

I have just read it all, some of it only partially.
Interesting.

They got very close, but their obstacle is noise simulation.
Noise is usually supposed to be something chaotic.
And it is very hard to compress or analyze chaos.
So instead of analyzing chaos - let’s make chaos. To the listener there will be no difference.
So order on the one hand and chaos on the other. Now it only remains to find fire, air, water, earth and ether :D

By grains they mean very short time intervals.
By quantums I mean impulses. Let’s say 440 pulses per second. No matter what waveform is used - triangle, sine, round or square -
it only changes the amount of energy “propelled” into space. Their sounds will be quite similar.
But if you make a mess of the intervals between the quantums - that will make a difference.
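If I read the impulse idea right, the claim is easy to try: generate a regular 440-pulses-per-second click train and one with jittered intervals, then listen to both. The jitter amount below is arbitrary, just enough to make the comparison audible.

```python
import numpy as np

def pulse_train(rate_hz=440.0, seconds=1.0, jitter=0.0, sample_rate=44100, seed=0):
    """A click train: one unit impulse ('quantum') per interval.
    jitter=0 gives perfectly regular intervals; jitter=0.5 randomises each
    interval by up to +/-50% while keeping roughly the same average rate."""
    rng = np.random.default_rng(seed)
    out = np.zeros(int(seconds * sample_rate))
    t = 0.0
    while t < seconds:
        out[int(t * sample_rate)] = 1.0
        t += (1.0 + jitter * rng.uniform(-1.0, 1.0)) / rate_hz
    return out

regular = pulse_train(jitter=0.0)   # steady 440 pulses per second
messy   = pulse_train(jitter=0.5)   # same average rate, scrambled intervals
```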

Another thought has been formed by my mental processor:
Frequency, time and the overall length are inseparable. A sound instance is one entity that can be represented by an FFT picture, or by a sequence of output levels.
As I see it, something is messed up with this model of sound. It cannot be solved in a direct, rough way.

You can’t do that - the first and second waves would sound totally different; one would contain a single pitch (a sine wave), while the other has loads of added harmonics!

The quarter-wave time of going from zero to max is far less than what is perceived as attack, even at the lowest notes registered by the human ear - at 20 Hz, a quarter period is only 1/(4 × 20) = 12.5 ms.

Yes, it surely does from the FFT point of view.

But the picture above can raise some questions.

If I get it right, the pressure must fall back before the ear can register a pulse.
If you mean the attack of an instrument - the attack can be represented as low-frequency content that doesn’t change from note to note.