How to change a voice's pitch in real time?

Discussion:

Piotr Mancini

2020-07-13 03:58:26 UTC

I just learned how to convert an audioclip from a 33.3 rpm vinyl record to 78 rpm, here:

https://community.adobe.com/t5/audition/converting-33-recording-to-78-how/td-p/9448709?page=1

What I need is similar but probably harder. I am developing a web application based on a segment from a 2013 TV program. Only the first seconds are relevant:

http://youtu.be/8MF04X2aLBw

The clone of that segment is an interactive application, under construction, seen below. Notice how the user has two degrees of freedom: the vector's magnitude and its angle.

http://www.dealey-plaza.org/this-government-as-promised/SBT-MBT-Tools/Haags-Measurement-Tool/

I already have the code that will receive those two variables as the user drags the mouse around. The effects will be:

- Vector Magnitude changes: the voice volume increases/decreases. This is most likely the easy part.

- Vector Angle changes: as it is modified the pitch (listen to the two extremes attached) will vary.

What I need is the back-end part (library, etc).

The question is actually more general than simple signal intensity and frequency. I need to "modulate" a signal based on user's activity. Any recommendations are welcome.
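
A minimal sketch of the control mapping described above, assuming the front end delivers the vector as mouse offsets in pixels. The names `vector_to_controls` and `apply_gain`, the 200-pixel full-scale length, and the one-octave pitch range are illustrative assumptions, not part of the existing application.

```python
import math

def vector_to_controls(dx, dy, max_length=200.0, max_semitones=12.0):
    """Map a dragged vector to (gain, pitch_ratio).

    Assumptions (hypothetical, not from the thread):
    - magnitude 0..max_length pixels maps linearly to gain 0..1
    - angle -pi..pi maps linearly to -max_semitones..+max_semitones
    """
    magnitude = math.hypot(dx, dy)
    angle = math.atan2(dy, dx)                      # radians, in [-pi, pi]
    gain = min(magnitude / max_length, 1.0)         # clamp at full volume
    semitones = max_semitones * angle / math.pi
    pitch_ratio = 2.0 ** (semitones / 12.0)         # 12 semitones = factor of 2
    return gain, pitch_ratio

def apply_gain(samples, gain):
    """Volume change is a plain multiplication of every sample."""
    return [s * gain for s in samples]
```

The gain can be applied directly to the samples; the pitch ratio is only a control value and still has to be handed to whatever pitch-shifting routine ends up being used.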

TIA,

-Ramon F. Herrera
JFK Numbers

Piotr Mancini

2020-07-13 19:57:15 UTC

Post by Piotr Mancini
https://community.adobe.com/t5/audition/converting-33-recording-to-78-how/td-p/9448709?page=1
http://youtu.be/8MF04X2aLBw
The clone of that segment is an interactive application, under construction, seen below. Notice how the user has two degrees of freedom: the vector's magnitude and its angle.
http://www.dealey-plaza.org/this-government-as-promised/SBT-MBT-Tools/Haags-Measurement-Tool/
- Vector Magnitude changes: the voice volume increases/decreases. This is most likely the easy part.
- Vector Angle changes: as it is modified the pitch (listen to the two extremes attached) will vary.
What I need is the back-end part (library, etc).
The question is actually more general than simple signal intensity and frequency. I need to "modulate" a signal based on user's activity. Any recommendations are welcome.
TIA,
-Ramon F. Herrera
JFK Numbers

Wow! This used to be one of the few Usenet newsgroups that had survived the onslaught of the sons/daughters of bitches. In fact, the production and interchanges were remarkable.

The bastards killed it!

Is there ANY Usenet Newsgroup that is actually functional?

-Ramon
JFK Numbers

boB

2020-07-16 01:06:38 UTC

On Mon, 13 Jul 2020 12:57:15 -0700 (PDT), Piotr Mancini

Post by Piotr Mancini

Wow! This used to be one of the few Usenet newsgroups that had survived the onslaught of the sons/daughters of bitches. In fact, the production and interchanges were remarkable.
The bastards killed it!
Is there ANY Usenet Newsgroup that is actually functional?
-Ramon
JFK Numbers

Yes, there are a few, I think. But not many.

I remember this group in the early 1990s!

People here helped me out when I needed another DSP56001 processor and
someone sent me one! Still have it today.

Long live comp.dsp

Or something like that...

boB

g***@u.washington.edu

2020-07-14 01:53:27 UTC

On Sunday, July 12, 2020 at 8:58:29 PM UTC-7, Piotr Mancini wrote:

(snip)

Post by Piotr Mancini
- Vector Angle changes: as it is modified the pitch (listen to
the two extremes attached) will vary.

More usual are ones to change speed, but not pitch. Since you
can change the sampling rate, that is equivalent.
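
As a small worked example of what a plain playback-rate change does (speed and pitch move together), here is a sketch that assumes only the standard relation of 12 semitones per factor of two. The function name `rate_change_effect` is made up for illustration.

```python
import math

def rate_change_effect(speed_ratio, duration_s):
    """Return (new_duration_s, pitch_shift_in_semitones) for a plain rate change.

    Playing samples `speed_ratio` times faster shortens the clip by that
    factor and multiplies every frequency by the same factor.
    """
    new_duration = duration_s / speed_ratio
    semitones = 12.0 * math.log2(speed_ratio)
    return new_duration, semitones

# The 33.3 rpm to 78 rpm case from the first post: ratio 78 / (100 / 3) = 2.34
duration, shift = rate_change_effect(78.0 / (100.0 / 3.0), 60.0)
```

For the turntable case the ratio is 2.34, so a 60 s clip lasts about 25.6 s and sounds roughly 14.7 semitones higher.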

Before digital, there were analog tape players that did this
using a moving head.

You can speed up voice by cutting out segments, long enough to
determine pitch, but short enough not to determine phonemes.

With the popular 44.1kHz sampling rate divisible by 3,
(3*14.7kHz), you could speed up by 1.5 by removing 0.1s every 0.2s,
so remove 14700 samples, then leave 29400 samples.

Now you want to resample. Since it is hard to describe a better
way in a short note, double every other sample of the 29400 sample
fragment. You should probably low-pass the result, but
maybe close enough.
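
The recipe above can be sketched as follows. This is a rough illustration, not a polished pitch shifter: it follows the sample counts given (14700 removed, 29400 kept, which at 44.1 kHz are 1/3 s and 2/3 s rather than 0.1 s and 0.2 s), and it leaves out the suggested low-pass filter. The function names are made up for the example.

```python
SAMPLE_RATE = 44_100
DROP = 14_700   # samples removed from each period (1/3 s at 44.1 kHz)
KEEP = 29_400   # samples kept from each period (2/3 s at 44.1 kHz)

def speed_up(samples, drop=DROP, keep=KEEP):
    """Remove `drop` samples, then keep `keep` samples, over and over.

    With the default counts the result is 1.5 times shorter, and the
    pitch inside each kept segment is unchanged.
    """
    period = drop + keep
    out = []
    for start in range(0, len(samples), period):
        out.extend(samples[start + drop:start + period])
    return out

def stretch_by_doubling(samples):
    """Crude resampling: repeat every other sample, so 2 samples become 3."""
    out = []
    for i, s in enumerate(samples):
        out.append(s)
        if i % 2 == 1:
            out.append(s)       # duplicate the 2nd, 4th, 6th, ... sample
    return out

def lower_pitch(samples):
    """Cut segments, then stretch back to the original length."""
    return stretch_by_doubling(speed_up(samples))
```

Played back at 44.1 kHz, the output of `lower_pitch` has the original duration with every frequency scaled by 2/3, about 7 semitones lower. The sample duplication adds audible artifacts, which is why low-pass filtering the result is suggested.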

Piotr Mancini

2020-07-14 03:00:55 UTC

Post by g***@u.washington.edu
(snip)

Post by Piotr Mancini
- Vector Angle changes: as it is modified the pitch (listen to
the two extremes attached) will vary.

More usual are ones to change speed, but not pitch. Since you
can change the sampling rate, that is equivalent.
Before digital, there were analog tape players that did this
using a moving head.
You can speed up voice by cutting out segments, long enough to
determine pitch, but short enough not to determine phonemes.
With the popular 44.1kHz sampling rate divisible by 3,
(3*14.7kHz), you could speed up by 1.5 by removing 0.1s every 0.2s,
so remove 14700 samples, then leave 29400 samples.
Now you want to resample. Since it is hard to describe a better
way in a short note, double every other sample of the 29400 sample
fragment. You should probably low-pass the result, but
maybe close enough.

Thank you so much! Finally...

What I need pretty much is an OSS library to manipulate audio signals. The more high level (audio-specific?), the better.

Once I have that, I will either code the app myself, or (most likely) get a hired gun (Freelancer) to do the implementation you described for me.

Thanks again!!

-Ramon
JFK Numbers

Sebastian Doht

2020-07-16 19:21:21 UTC

Post by Piotr Mancini

Post by g***@u.washington.edu
(snip)

Post by Piotr Mancini
- Vector Angle changes: as it is modified the pitch (listen to
the two extremes attached) will vary.

More usual are ones to change speed, but not pitch. Since you
can change the sampling rate, that is equivalent.
Before digital, there were analog tape players that did this
using a moving head.
You can speed up voice by cutting out segments, long enough to
determine pitch, but short enough not to determine phonemes.
With the popular 44.1kHz sampling rate divisible by 3,
(3*14.7kHz), you could speed up by 1.5 by removing 0.1s every 0.2s,
so remove 14700 samples, then leave 29400 samples.
Now you want to resample. Since it is hard to describe a better
way in a short note, double every other sample of the 29400 sample
fragment. You should probably low-pass the result, but
maybe close enough.

Thank you so much! Finally...
What I need pretty much is an OSS library to manipulate audio signals. The more high level (audio-specific?), the better.
Once I have that, I will either code the app myself, or (most likely) get a hired gun (Freelancer) to do the implementation you described for me.
Thanks again!!
-Ramon
JFK Numbers

Not sure if it has exactly what you have been looking for, but have you
had a look at the open source audio editor Audacity
(https://www.audacityteam.org/)? If it can accomplish what you need you
might be able to extract the required functions from its backend library
portaudio.

Greetz,

Sebastian

g***@u.washington.edu

2020-07-17 00:18:30 UTC

On Thursday, July 16, 2020 at 12:21:25 PM UTC-7, Sebastian Doht wrote:

(snip)

Post by Sebastian Doht
Not sure if it has exactly what you have been looking for, but have you
had a look at the open source audio editor Audacity
(https://www.audacityteam.org/)? If it can accomplish what you need you
might be able to extract the required functions from its backend library
portaudio.

Note that the OP's question can be answered with two operations.

The first changes the speed without changing the pitch; the second
resamples to get back the original speed, now with the pitch changed.

The OP asked for 'real time', which I will interpret as minimal delay.
It can't be done with zero delay, but maybe close enough.
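
A minimal sketch of that two-step chain, under stated assumptions: the time stretch is done with plain windowed overlap-add (a technique chosen here for brevity, not one named in this thread), and the resampler uses linear interpolation. All function names and the 1024-sample frame are illustrative. Plain overlap-add produces audible artifacts on voiced speech; pitch-synchronous methods or a phase vocoder are the usual refinements.

```python
import math

def hann(n):
    """Periodic Hann window; at 50% overlap, shifted copies sum to 1."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]

def time_stretch_ola(samples, factor, frame=1024):
    """Change duration by `factor` (>1 = longer) without resampling."""
    hop_out = frame // 2                 # output frames overlap by 50%
    hop_in = hop_out / factor            # input is read more slowly when stretching
    window = hann(frame)
    n_frames = max(1, int((len(samples) - frame) / hop_in) + 1)
    out = [0.0] * (hop_out * (n_frames - 1) + frame)
    for k in range(n_frames):
        src = int(round(k * hop_in))
        dst = k * hop_out
        for i in range(frame):
            if src + i < len(samples):
                out[dst + i] += samples[src + i] * window[i]
    return out

def resample_linear(samples, ratio):
    """Read the input `ratio` times faster, interpolating between samples."""
    n_out = int((len(samples) - 1) / ratio) + 1
    out = []
    for j in range(n_out):
        pos = j * ratio
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, len(samples) - 1)]
        out.append((1.0 - frac) * samples[i] + frac * nxt)
    return out

def pitch_shift(samples, ratio, frame=1024):
    """Stretch by `ratio`, then resample by `ratio`: same length, new pitch."""
    return resample_linear(time_stretch_ola(samples, ratio, frame), ratio)
```

In a streaming setting the delay of this approach is at least one frame (about 23 ms for 1024 samples at 44.1 kHz), which is in line with the point that zero delay is not achievable.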

Piotr Mancini

2020-07-14 03:48:35 UTC

Post by Piotr Mancini
https://community.adobe.com/t5/audition/converting-33-recording-to-78-how/td-p/9448709?page=1
http://youtu.be/8MF04X2aLBw
The clone of that segment is an interactive application, under construction, seen below. Notice how the user has two degrees of freedom: the vector's magnitude and its angle.
http://www.dealey-plaza.org/this-government-as-promised/SBT-MBT-Tools/Haags-Measurement-Tool/
- Vector Magnitude changes: the voice volume increases/decreases. This is most likely the easy part.
- Vector Angle changes: as it is modified the pitch will vary.
What I need is the back-end part (library, etc).
The question is actually more general than simple signal intensity and frequency. I need to "modulate" a signal based on user's activity. Any recommendations are welcome.
TIA,
-Ramon F. Herrera
JFK Numbers

This needs further explanation. I will try to be as succinct as possible.

If you prefer technical issues only and don't care about history, politics and controversy please STOP reading now. Move on.

If you haven't, please watch this video clip and pay close attention. It is the most advanced study ever done of the shooting, and it was paid for with the unlimited expense credit card of the Koch brothers.

http://youtu.be/8MF04X2aLBw

There have been 12 "scientific" studies of the Kennedy murder. All have produced pre-ordained results: some are LN ("It was Lee, alone, 3 shots"), the rest are CT. What they have in common is that they are all fraudulent. Every single one of them. See them here:

https://archive.org/details/@the_12_fraudulent_studies?sort=titleSorter

This is also important. Below is the front end to the audio-distortion application, just click on "Continue".

http://www.dealey-plaza.org/this-government-as-promised/SBT-MBT-Tools/The-12-Fraudulent-Studies/

One of my fundamental beliefs is this, as I told a book author:

- Chances of a lawyer admitting to a counterpart: "You were right all these years, it was a conspiracy"?
Hades will proverbially freeze over before that happens.

- Chances of a physician telling a dissenting colleague: "You were right on the autopsy X-rays, the fatal shot did not come from behind"
Similar to the odds above.

- Chances of an engineer, physicist, 3D designer, etc.?

Now we are talking. And I mean that in the literal sense: They ARE talking. Our colleagues are. Notice these 2 images:

(Two images, not preserved.)

That is enough intro. Later, I will be asking (make that: begging) for help on the audio aspects of "The Subject That Never Dies".

The official version is a dead man walking, BTW. It is up to us, numerically trained people (who have been away from the case, always controlled by liars, err, I mean: lawyers) to solve forever that tragic event. We cannot allow the Fake News, haters of academia, science, logic, MAHA hat wearing types to destroy history.

-Ramon
JFK Numbers
ramon at jfknumbers dot org

g***@u.washington.edu

2020-07-17 00:22:56 UTC

On Sunday, July 12, 2020 at 8:58:29 PM UTC-7, Piotr Mancini wrote:

(snip)

Post by Piotr Mancini
The question is actually more general than simple signal
intensity and frequency. I need to "modulate" a signal based
on user's activity. Any recommendations are welcome.

Reminds me of a seminar last year (that is, pre-Covid) where I found
myself wondering about an accent remover. The speaker had a strong accent,
which made the talk hard to understand. (That was in CSE, too!)

At a later seminar, related to deep learning and neural nets,
someone had a system that might be able to do it. It would use
one voice as a training set, then convert others into that voice.

But note that this could also be done using a speech to text system,
followed by a text to speech synthesizer. Not real time, but maybe
close enough.