I might (doubtful) want to try and sell this device, so I can't simply
take some Logitech web cam and use it due to obvious licensing issues.
So that's a huge constraint.
Basically you will be creating an IP camera. The Raspberry Pi is probably the cheapest and easiest way to prototype this, although there are other boards such as the BeagleBoard family.
Once you get a prototype going, you may consider creating your own all-in-one device that uses an ARM or DSP processor. For example, I would probably use some type of serial JPEG camera module, a cheap microphone, and the cheapest ARM processor that fits these requirements. But for a prototype, the Raspberry Pi and a cheap USB webcam is probably the cheapest and quickest way to get started and get your software going. You may even be able to find a cheap no-name USB camera from China that will work for this, so you can resell it in small quantities.
So as far as getting the frames to the MCU, there's two main issues
here: (a) the choice of camera and microphone, and (b) the drivers for
connecting the cam/mic to the Pi's USB port.
Here is the huge list of devices that work with the Raspberry Pi: RPi Verified Peripherals. The USB Webcams section lists both working and problem units, along with a bit of extra info. After you find a unit that fits your requirements (price, etc.), I would double-check via Google that someone has used it and that it does indeed work, although this is probably unnecessary.
I'm sure there are other units that work but haven't been tested. The two things that will help you are to make sure the unit is Linux compatible and that there are ARM drivers available.
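One quick way to see whether Linux has picked up a camera is to look for the video device nodes it creates. The following is a minimal sketch, assuming the camera's driver exposes the usual `/dev/videoN` nodes; the function name `list_video_devices` and the `dev_dir` parameter are hypothetical, made up for illustration.

```python
import glob
import os
import re


def list_video_devices(dev_dir="/dev"):
    """Return video device nodes (video0, video1, ...) found in dev_dir.

    Assumption: the webcam driver registers nodes named videoN, which is
    the usual convention for Video4Linux devices.
    """
    nodes = []
    for path in glob.glob(os.path.join(dev_dir, "video*")):
        match = re.fullmatch(r"video(\d+)", os.path.basename(path))
        if match:
            # Keep the numeric index so video10 sorts after video2.
            nodes.append((int(match.group(1)), path))
    return [path for _, path in sorted(nodes)]


if __name__ == "__main__":
    devices = list_video_devices()
    if devices:
        print("Video devices found:", ", ".join(devices))
    else:
        print("No video devices found; the camera may lack a working driver.")
```

An empty result after plugging the camera in suggests the kernel has no driver for it, which is the situation the compatibility list is meant to help you avoid.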
There's also the issue of A/V encoding as well as synchronizing the
video and the audio feeds together.
As I mentioned in my comment, the Raspberry Pi really won't have any issues handling this part. It has more than enough processing power to handle most A/V formats.
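To make the synchronization idea concrete: streaming tools generally keep audio and video aligned by stamping both with times on a shared clock. The sketch below only illustrates that arithmetic; the function names and the example frame rate and sample rate are assumptions, not something taken from any particular tool.

```python
from fractions import Fraction


def video_pts(frame_index, fps):
    """Presentation time in seconds of a video frame at a constant frame rate."""
    return Fraction(frame_index) / Fraction(fps)


def audio_pts(sample_index, sample_rate):
    """Presentation time in seconds of an audio sample at a constant sample rate."""
    return Fraction(sample_index, sample_rate)


def av_drift_ms(frame_index, fps, sample_index, sample_rate):
    """How far video is ahead of audio, in milliseconds (negative if behind)."""
    drift = video_pts(frame_index, fps) - audio_pts(sample_index, sample_rate)
    return float(drift * 1000)


if __name__ == "__main__":
    # Hypothetical capture settings: 15 frames/s video, 44.1 kHz audio.
    print(av_drift_ms(frame_index=31, fps=15, sample_index=88200, sample_rate=44100))
```

In practice you would let the streaming tool do this for you; the point is only that sync comes down to comparing timestamps, which is cheap work for the Pi.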
the Pi (which would be running GNU/Debian linux) will have the right
drivers to ingest the streaming frames and send them off to a tool
that would then be able to forward them on to a WiFi or Ethernet
adapter
Basically the Pi is just going to be a Linux computer that is connected to the internet and has the camera and microphone plugged into it. You will install and configure libasound2-dev (for the audio) and FFmpeg to stream everything.
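As a rough idea of what the FFmpeg side can look like, here is a minimal sketch that assembles a capture-and-stream command from Python. Everything specific in it is an assumption for illustration: the device names (`/dev/video0`, `hw:1,0`), the resolution and frame rate, the MPEG-2/MP2 codecs, and the UDP destination. It sends an MPEG-TS stream to a UDP address as a stand-in and is not necessarily the setup any particular guide uses.

```python
import shlex
import subprocess


def build_ffmpeg_command(video_dev="/dev/video0",
                         audio_dev="hw:1,0",
                         output="udp://192.168.1.10:5000"):
    """Build an FFmpeg command that captures webcam video and ALSA audio.

    All defaults are placeholders; adjust them for your camera,
    microphone, and network.
    """
    return [
        "ffmpeg",
        # Video input: Video4Linux2 capture device.
        "-f", "v4l2", "-framerate", "15", "-video_size", "640x480",
        "-i", video_dev,
        # Audio input: ALSA capture device.
        "-f", "alsa", "-i", audio_dev,
        # Encode both and mux them into one MPEG-TS stream.
        "-c:v", "mpeg2video", "-c:a", "mp2",
        "-f", "mpegts", output,
    ]


def start_stream(**kwargs):
    """Launch FFmpeg with the command above; returns the running process."""
    return subprocess.Popen(build_ffmpeg_command(**kwargs))


if __name__ == "__main__":
    # Print the command rather than running it, so it can be inspected first.
    print(shlex.join(build_ffmpeg_command()))
```

Because FFmpeg muxes the audio and video inputs into a single output, it also takes care of keeping the two in step.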
After that, the Pi will be just like any other server online (you may need to configure your router, port forwarding, etc. to make it visible on the web). According to this, you and other computers will access it by going to http://YOUR_WEBCAM_SERVER/webcam.mjpeg
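On the client side, an MJPEG stream is essentially a sequence of JPEG images, so a program can pull frames out of it by looking for the JPEG start and end markers. The sketch below is a simplified illustration, assuming the stream contains plain JPEG frames without embedded thumbnails (which could confuse the marker search); the function names are hypothetical and the URL is the placeholder from above.

```python
import urllib.request

SOI = b"\xff\xd8"  # JPEG start-of-image marker
EOI = b"\xff\xd9"  # JPEG end-of-image marker


def split_jpeg_frames(buffer):
    """Split complete JPEG frames out of buffer.

    Returns (frames, remainder), where remainder holds any trailing
    bytes of a frame that has not fully arrived yet.
    """
    frames = []
    while True:
        start = buffer.find(SOI)
        if start == -1:
            # No frame start yet; keep the last byte in case it is
            # the first half of a marker split across two reads.
            return frames, buffer[-1:]
        end = buffer.find(EOI, start + 2)
        if end == -1:
            return frames, buffer[start:]
        frames.append(buffer[start:end + 2])
        buffer = buffer[end + 2:]


def read_frames(url, chunk_size=4096):
    """Yield JPEG frames from an MJPEG-over-HTTP stream at url."""
    pending = b""
    with urllib.request.urlopen(url) as response:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            frames, pending = split_jpeg_frames(pending + chunk)
            yield from frames


if __name__ == "__main__":
    # Placeholder address; replace with your own server.
    for i, frame in enumerate(read_frames("http://YOUR_WEBCAM_SERVER/webcam.mjpeg")):
        print(f"frame {i}: {len(frame)} bytes")
```

A browser pointed at the same URL does the equivalent work itself, so this is only needed if you want to process the frames in your own software.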
TI sells the CC3000, which is very much RTOS-less. The stack and everything else is on the chip/module, and you just need a driver that takes about 6k of code and 3k of RAM (sometimes even less if you're willing to sacrifice throughput).
Last I saw, the evaluation board, which is just the CC3000 (or the BoosterPack for the LaunchPad), was going for around $30.
Best Answer
It depends on the exact STM32 variant you're planning to use, but hosting a web page might very well be possible. If you're just making a one-off demo / proof of concept, you could consider using one of these:
Segger embOS/IP is available as an evaluation version for various STM32 processors. See this page for some well-documented and easily extendable examples:
https://www.segger.com/st-microelectronics.html
Segger also has a USB RNDIS component available, which would allow you to access the web server via USB, so you wouldn't need an Ethernet controller. I couldn't find an evaluation version of it, though.
EmCraft has some system-on-module solutions available that run uClinux. They have a built-in web server which is easy to customize.
http://www.emcraft.com/products/224
STMicroelectronics has a code generator suite called STM32CubeMX. It might also offer a good starting point for developing a web server.
Hope this helps.