I'm interested in creating IMUs, as Musagi doesn't suit all of my needs as far as instruments go. (For many of the instruments it appears that the sound quality degrades at the lowest octaves, and I'd like to try and get really clean, high-quality notes in that area.)
1. Anyone know of any particular ways of creating IMUs?
2. DrPetter, how did you go about creating the IMUs for Musagi?
Ideally I plan to create a program using the FMOD sound API to create and edit sound waves, and hopefully to output IMUs. If DrPetter can help me out with the IMU file format and how to output it, I'd gladly release the application to the community, or even bundle it with Musagi.
Musagi uses PortAudio, which works quite well. If you check the open-source downloads, you'll get all the files to compile your own version, as I've done already. I gave up on creating an MSVS project for it and went with DrPetter's command-line compiler.
You can use gin_chip.h as a template to create your own synth. Once you've figured out how the intervals are used, you should be able to put in larger cycles to obtain higher rates and better quality for the sounds themselves. I've only tweaked it for some additional modulation access and a few tiny things, but it works great!
DrPetter is currently still on vacation and will return in a week or two, I think. He'll be able to help you out further, of course. Just thought you might want a little heads-up sooner!
Thanks a lot for the quick response, taron. I'm currently working on a game for a contest and am on a really tight schedule. The reason I said I'd probably create a program with FMOD if need be is that I'm already relatively well-versed in it, and it's already being used in my project. I don't think I have the mental room up in my head to learn another API. From my understanding of what you just told me, though, I can just increase the corresponding interval values of the already existing synths to create higher quality sounds/instruments (like chip or xnes) at lower octaves. Is this true, or did I misunderstand? And if it is true, do you happen to know the source files and the particular variable values I need to increase to acquire better quality instruments? Thanks a lot for all your help taron =]
It's true, you should be able to do that (higher quality at lower levels, by increasing the size of the waveform buffers along with a proper sample implementation). The problem as it is now is that it only offers a very small buffer, which simply aliases due to the few samples it gets to read at lower frequencies. While there is a linear ramp from sample to sample, with so few samples per loop it just becomes noticeable as stepping. If you increase the sample buffer to 1024 or more, you get finer transitions and simply more data, which will keep the stepping from being recognizable at lower perceived frequencies.
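To make the stepping concrete, here's a rough sketch of reading a looping wavetable with linear interpolation. This is my own illustration, not Musagi's actual code; the function name and layout are made up for the example.

```cpp
#include <cmath>
#include <vector>

// Read a looping wavetable at a fractional phase in [0, 1) with linear
// interpolation between neighboring samples. With a tiny table (e.g. 256
// samples, 170 used) a low note only reads a few distinct values per cycle,
// so the linear ramps become audible stepping; a 1024+ sample table packs
// in more data and smooths out the transitions.
float readTable(const std::vector<float>& table, float phase)
{
    int   n   = static_cast<int>(table.size());
    float pos = phase * n;
    int   i0  = static_cast<int>(pos) % n;
    int   i1  = (i0 + 1) % n;              // wrap around the loop
    float t   = pos - std::floor(pos);     // fractional part
    return table[i0] + t * (table[i1] - table[i0]);
}
```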
I don't think you can 'easily' port the implementation from PortAudio to FMOD. I had somehow never managed to deal with FMOD; I just didn't have enough eagerness and knowledge to do so. PortAudio, however, posed far fewer issues for me, and I had enough help from DrPetter to get into it.
Sorry for the leave of absence, I've been getting a lot of hours at work. Anyways, I understand the logic of your method to improve the sound quality, but after trying to look through gin_xnes.h, I've been unable to pinpoint how to increase the size of the waveform buffers, due to my inexperience with PortAudio. Can you point me to the line of code where the value needs to be changed? I'd really appreciate it.
Also, knowing your love for audio, I figured I'd just bring up the discussion: have you ever used PxTone, in particular the PxVoice tool used to edit/create sound waves and create instruments through fine control of either the actual sine wave (or saw, or just a blank template, depending on what you want to start with) or through the audibility of the different harmonics within the sample? I figured that'd intrigue you if you've never heard of it. I want to one day create a more powerful, more fully featured version of that program. I think FMOD is unable to do so; maybe PortAudio or another audio API can help me out with that.
Hmmm... funny... but are you familiar with C? Hmmm, I don't mean that in a sarcastic way, I'm really asking. It's a little confusing to get a variable called "envelope" as an array pointer for the waveform, but that's where the sample data goes. It's currently all at 256, while it uses a 170-sample-wide loop. This has nothing to do with PortAudio at all. That's only there to eventually output all the data as sound; CHIP and XNES are software synths that simply create waveform data, so to say. In theory you could go in and expand the waveform length from 256 to any number, I'm sure, and then extend the loops from 170 to any equivalent. I haven't tried it yet, but I'd guess there's not too much trouble, except loading old sounds, of course (unfortunately), which is why I add terminators inside my own file formats to have chunks of data for a specific array of flexible size, but well...
Anyway, I was curious about the PxTone stuff, but never found a viable download so far. I don't want to spend my day searching, but if you have a few links, I'll definitely check it out! Sounds like chiptune stuff from what I got to read. Curious... love me my chiptunes!
I taught myself C++ basics and some fundamentals of music theory. Other than that, the vast majority of my programming experience is with APIs, learning to use them through trial and error and reading the documentation. So my underlying knowledge is relatively poor, although I plan to attend a university this coming fall as a Comp. Sci. major, so I'll be able to do the things I want to.
Anyways, I'm having difficulty understanding what you're saying. From my understanding, PortAudio only has to do with the eventual output and not the actual calculation of the sound. So I guess I have two questions. 1. Where are the values I need to increase? Is the 170 the one within the brackets of the envelope in the xnes_params structure, or elsewhere? And where is the 256? Is it every single 256 value in the header file? 2. Do you know of any resources I could read or use to educate myself further in terms of audio engineering in the manner we've been discussing?
There's a download link for PxTone (PxVoice is a program within the PxTone folder) right at the top of the page of this tutorial.
Well, if you run Musagi and look at a CHIP synth's interface, you'll see the envelope display, right? In it you can draw envelopes for the modulation of the sound over time, be it volume, phase, or the waveform itself. This little window allows you to adjust only 170 values (that's why it's also just 170 pixels wide). In reality these 170 values are being placed into an array of 256 values. These arrays are all stored in an array of arrays called envelope. That's why in the code you will find it defined as...
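Probably something along these lines (reconstructed from the description; the exact declaration in the header may differ slightly):

```cpp
// 3 envelopes of 256 samples each, as described below:
// [0] volume over time, [1] waveform length ("sync" modulation),
// [2] the waveform itself.
float envelope[3][256];
```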
That means there are 3 envelopes, each storing 256 values. Two of them are for modulation over time: one for volume, one for the length of the waveform (the length of the waveform readout can be modulated while the pitch automatically gets corrected; that's a so-called "sync" effect). The third envelope is the waveform itself. It's a little confusing when you look at certain parts of the CHIP's code, as it's based on the XNES and has a few rather "dirty" adjustments in it. But somehow I got it pretty quickly.
PortAudio basically just receives chunks of waveform data that are being updated constantly. That's how a realtime synth works: a buffer is filled with data, then the buffer is given to PortAudio for playback. As it plays back, another buffer (or part of a buffer) is updated by the software, which provides new data to play when PortAudio is done with the previous chunk. Kind of like feeding data into a cycle. PortAudio doesn't have to do much else at this point. The synthesis and all the other funny stuff, like panning and effects, is done in software. I'm sure some fancy things can be done with the audio hardware, but that's a different story.
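In simplified code, the cycle taron describes looks roughly like this. Real PortAudio hands you a callback to fill instead, but the idea is the same: the synth renders the next block of samples into a buffer, the audio layer drains it, repeat. The class and names here are made up for illustration.

```cpp
#include <cmath>
#include <vector>

// Schematic of the realtime-synth feed cycle: the synth renders the next
// block of the waveform into a buffer, which the audio API (PortAudio in
// Musagi's case) then plays back while the following block is rendered.
struct SineSynth {
    double phase = 0.0;      // current position in the cycle, [0, 1)
    double freq  = 440.0;    // pitch in Hz
    double rate  = 44100.0;  // output sample rate in Hz

    // Fill one buffer with the next chunk of the waveform.
    void render(std::vector<float>& buffer)
    {
        const double twoPi = 6.283185307179586;
        for (float& sample : buffer) {
            sample = static_cast<float>(std::sin(phase * twoPi));
            phase += freq / rate;
            if (phase >= 1.0) phase -= 1.0;  // keep phase in [0, 1)
        }
    }
};
```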
I'm pretty much in the same situation as you are, as I basically taught myself everything, too. I've just done that over a great many years, which might give me a bit of an edge. Additionally, I started with something like the programmer's Latin, hahaha: a disassembler on a C64 (BASIC on a C16, actually, but that doesn't count, I think). If you don't understand the basic mechanics of how it all works by then, you can't really do anything, hehe. The most important thing it taught me is to be very inquisitive and investigate everything that happens, up until a part that seems impossible to investigate, like the workings of the ROM inside those good old monsters, or simply anybody else's code, HAHAHAHA. I think that's what helped me the most and makes it fairly easy for me to browse through DrPetter's code. It's not nearly as horrible as he would describe it to be. But maybe we just have a similar coding style or spirit behind it.
I hope that helped a little bit. I'm afraid that studying at a university won't teach you all that much, but it certainly will buy you the time to learn more diligently yourself. Of course you will waste a lot of money doing all this, but that's apparently how it works in the States. A sad thing, I think, because with all your energy, I don't think you need to hunt for degrees while being drained before you even get to earn anything. Different story, though...
I think I'm starting to get it: the reason the sound sounds crappy at lower frequencies is that it's reading a very small portion of the actual waveform, so if I increase the overall length and data of the waveform, by increasing the values for the waveform data envelope in the chip synth, it'll sound better?
If I do that, though, won't Musagi just be storing the same 170 values in a larger container, so it wouldn't make any difference? Or am I misunderstanding?
Also, at lower frequencies with CHIP, when the note is first played there's kind of a thumping sound. Any idea or explanation for what that is?
P.S. Have you had any time to play with PxVoice, and if so how do you like it/what do you think?
You'd have to interpolate differently, I think. You'd have to consider the pixels of the display as spline knots instead. It can be a simple spline, though. It wouldn't be perfect for all things, but in general it would create a smooth and larger waveform that can (and should) be played back at a much faster rate to compensate for the pitch shift that happens if you read through more data. It's like stretching it out, you know. That higher rate will prevent audible quantization (aliasing-like effects) and give you nice and juicy sounds at the lowest frequencies.
It's really simple logic, but it might be even easier to explain with illustrations. Stretching out the data won't get you any advantage, of course, unless you do a proper interpolation. You might also use the XNES instead and generate the waveforms with some simple formulas (sine, ramps, etc.). That way it's perfectly smooth at all times.
Haven't played with PxVoice yet, but I'm planning to! Thanks for that, again. Sounds really interesting!
No problem, I thought you'd like it after reading your posts and seeing how much you love audio. Any idea how I would go about interpolating it properly by changing the source code?
www.cubic.org/docs/hermite.htm There is another link to Bézier curves in it, but the Catmull-Rom splines are probably the most important for you. If you go down to the code that feeds the interface input into the envelope, you'll find, I think, where you would want to squeeze in the interpolation. I'm currently not dealing with it, so you're a little on your own there, but it's really not too brutal in there. Once you get into it, it's even quite a bit of fun, as I remember.
Yeah, PxVoice looks entertaining. I might have a closer look at their API, but all the comments are in Japanese as far as I can tell... which doesn't exactly help me, haha... AAAHHHRGH. But it's cute.