How to convey the content type of a resource in a URI


I would like to use URIs to refer to the different files used in our system. To know which module should parse a file, it would help if the URI could also encode the content type of the resource it points to, so that the resource can be handed to the right parsing module.

I was thinking of extending the scheme part to convey this. For example, file+csv:///path/to/file would point to a CSV file, while file+caffe:///path/to/directory would point to a directory with a Caffe model and parameters, and so on. I only need to support a limited set of types, so does this seem like a reasonable approach?

But is there some other standard way?

Best Answer

URLs by themselves are largely protocol-agnostic. They specify little more than a common syntax and basic semantics. A URL generally describes how to find something, but not what you'll find there.

It is the job of a particular protocol such as HTTP to indicate the content type. Some resources do not have a meaningful content type, for example mailto: URLs. The FTP protocol has no concept of MIME types, but merely distinguishes textual files, binary files, and directories (specified as a ;type=<typecode> parameter in an FTP URL). Regarding file URLs, RFC 1738 Uniform Resource Locators (URL) notes:

The file URL scheme is unusual in that it does not specify an Internet protocol or access method for such files; as such, its utility in network protocols between hosts is limited.

RFC 8089 The "file" URI Scheme concurs:

The file URI scheme is not coupled with a specific protocol nor with a specific media type [RFC6838].

So most URL schemes do not allow you to include the content type in the URL, and there is no scheme-agnostic mechanism to do that.

You can of course develop your own non-standard URL scheme that combines a MIME type with a transport. It would be best not to put the type into the scheme name; I'd consider a design such as example:text/csv:file:///path/to/file.
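As a rough illustration of that design, here is a minimal Python sketch that splits such a wrapped URI into its media type and inner URL. The example: prefix comes from the answer above; the function name split_typed_uri and the choice to reject anything else are assumptions made for this sketch, and it assumes a bare media type without parameters (so the type itself contains no colon).

```python
from urllib.parse import urlsplit

PREFIX = "example:"  # placeholder scheme name from the answer, not a registered scheme


def split_typed_uri(uri):
    """Split 'example:<media-type>:<inner-url>' into (media_type, inner_url)."""
    if not uri.startswith(PREFIX):
        raise ValueError(f"not an {PREFIX} URI: {uri!r}")
    rest = uri[len(PREFIX):]
    # A bare media type contains no ':', so the first ':' ends the type.
    media_type, sep, inner_url = rest.partition(":")
    if not sep or "/" not in media_type or not inner_url:
        raise ValueError(f"malformed typed URI: {uri!r}")
    return media_type, inner_url


media_type, inner = split_typed_uri("example:text/csv:file:///path/to/file")
parts = urlsplit(inner)  # the inner URL is an ordinary file URL
print(media_type, parts.scheme, parts.path)
```

The media type can then be used as a key to pick the parsing module, while the inner URL is handled by whatever already understands plain file URLs.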

Alternatively, you could store the type in a query parameter of a file URL. Note that the file URI syntax defined by RFC 8089 has no query component, and such URLs may cause problems with some implementations on Windows systems. On the other hand, parsers that follow the WHATWG's generic URL parsing algorithm accept a query on file URLs and keep it separate from the path, so it does not interfere with locating the file.
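A minimal sketch of the query-parameter variant in Python, using only the standard urllib.parse module. The parameter name type and the helper name path_and_type are assumptions chosen for this example, not part of any standard.

```python
from urllib.parse import parse_qs, urlsplit


def path_and_type(uri, param="type"):
    """Return (path, media_type) for a file URL; media_type is None if absent."""
    parts = urlsplit(uri)
    if parts.scheme != "file":
        raise ValueError(f"expected a file URL, got {uri!r}")
    # parse_qs maps each parameter name to a list of values.
    values = parse_qs(parts.query).get(param)
    return parts.path, (values[0] if values else None)


print(path_and_type("file:///path/to/file?type=text/csv"))
```

One caveat with this approach: parse_qs decodes a literal + as a space, so a media type such as image/svg+xml would have to be written with %2B in the query.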

Related Solutions

Rest – Is it OK to use a non-primary key as the id in a Rails resource

Actually, there are 3 reasons:

Convention
Convention
Convention

Convention is important because it makes it easier for other programmers to understand your code. In RoR it's even more important, since many facilities of the framework and third-party plugins assume you are following the convention. Often they'll offer a configuration interface you can use if you violate the convention, but this makes their usage more verbose, complicated, and error-prone.

So, you should only break convention if you have a really good reason to do so - and this isn't one, because there is a much simpler alternative: using the GET parameters!

Instead of hijacking the show action, create a new route, e.g. find_and_show. That action will accept the GCM id as a GET parameter instead of as part of the path, so the request will be:

GET devices/find_and_show?gcm_id=<some_long_gcm_id>

The method should find the Device by the gcm_id, and the rest of it could be shared with the regular show method, or even delegated to it with a redirect.

This way you are not breaking any convention, you are making it clear that you are searching by GCM id rather than by the PK, and if you ever want to allow retrieving via other unique identifiers you can simply make find_and_show support more GET parameters.

And yes, it'll make the URLs longer, but it's not like humans will type these URLs...

MVC URL Structure with URI Parameters

I would say it depends on your needs.

Do you need to know the name of the variable?
Do you want the (multiple) variables to be passed in a specific order?
Is the number of variables going to grow (like filters)?
Does it need to be part of the URL (e.g. for SEO reasons)?

I mostly know two variants (different from the examples you gave), and in the applications I write I handle them as follows:

1. In the URL: www.youtube.com/brandname

Using a regular expression I predefine the names of the different parts in the url. And thus their position is (mostly) fixed.

/(?<id>[0-9]+)/(?<slug>[a-z0-9]+)/

Then you can pass these, named or unnamed, to the controller. How you pass them doesn't matter much, since you already know which parts are coming in.

function channel($id, $slug)
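The same idea can be sketched in Python, whose re module writes named groups as (?P<name>...). The route pattern, the dispatch function, and the stand-in channel action below are all assumptions for illustration, not taken from any particular framework.

```python
import re

# Hypothetical route: a numeric id followed by a lowercase alphanumeric slug.
ROUTE = re.compile(r"^/(?P<id>[0-9]+)/(?P<slug>[a-z0-9]+)/?$")


def channel(id, slug):
    """Stand-in controller action; receives the named URL parts as strings."""
    return f"channel {id} ({slug})"


def dispatch(path):
    match = ROUTE.match(path)
    if match is None:
        return "404"
    # groupdict() maps group names to the captured text, so the parts
    # can be passed to the controller as keyword arguments.
    return channel(**match.groupdict())


print(dispatch("/42/brandname/"))
```

Because the names are fixed by the pattern, the controller signature documents exactly which URL parts it expects.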

2. In the GET/POST request: www.google.com/?q=foo&client=bar&channel=baz&...

Just accessing the GET/POST directly.

function search() {
    // invalid() stands in for your own validation helper
    if (empty($_GET['q']) || invalid($_GET['q'])) {
        // return to user
        return;
    }

    $query = $_GET['q'];
}