Comments
-
Greg Hurrell
Note that file attachments might also be useful in the blog and the wiki, and should be easy enough to add; if they can be done for issues they can be done for the others too.
-
Greg Hurrell
As for the association type, there are two options:
Firstly, "issue/article/post has many attachments" ("has many"/"belongs to"). An attachment can only belong to one parent resource at any given time. Pros: access control straightforward, implementation is quick and easy, code is simple. Cons: data duplication if you want the same attachment to be attached to different tickets/articles/posts.
Secondly, "has many through/has and belongs to many". An attachment could be associated with lots of different parent resources, each represented using an "attaching" or "attachment relation" record. Pros: no data duplication. Cons: code much more complex.
For me this is a no-brainer and the first option is the way to go. Data duplication is an imaginary problem that should be dealt with if and only if I come to a point where I start finding it excessive.
-
Greg Hurrell
Not to be confused with ticket #1276, which is about a generic file-upload interface for admin use, not related to the issue tracker.
-
Greg Hurrell
See also:
- ticket #446: "Use RESTful web service instead of email for receiving crash reports"
- ticket #448: "Integrate WODebug crash reports into Rails backend".
-
Greg Hurrell
Useful docs for serving files via nginx rather than tying up a Mongrel process:
- http://blog.kovyrin.net/2006/11/01/nginx-x-accel-redirect-php-rails/
- http://wiki.nginx.org/NginxXSendfile
- http://spongetech.wordpress.com/2007/11/13/the-complete-nginx-solution-to-sending-flowers-and-files-with-rails/
Evidently we don't want to stick files in the public directory directly, seeing as they could contain sensitive information and be attached to private tickets (a customer attaching proof of payment, for example).
-
Greg Hurrell
So X-Accel-Redirect is the way to go for file downloads, but file uploads may need another solution, at least in the long term. Even though nginx can buffer the upload and hand it over to Rails all at once, actually processing the data can still tie up a Mongrel process for a long while.
-
Greg Hurrell
For the file upload module, see:
-
Greg Hurrell
A semi-related post on handling multiple file uploads:
(Doesn't mention nginx at all but does touch on some UI design issues.)
After thinking about this for a while, here is my current thinking:
Downloads
Using X-Accel-Redirect is an absolute no-brainer. We delegate responsibility for streaming files off the disk to nginx, thereby ensuring that even for large files our Mongrel processes don't get tied up for long periods of time, nor is there any risk of them trying to read large files into memory.
And we get all the benefits of involving Rails in the process without any of the potential memory and CPU costs. That is, we can do things like access control, mark attachments as "private/public" or "awaiting/not awaiting moderation", and track user ownership of attachments.
So, like I said, a total no-brainer.
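As a minimal sketch of that download path (plain Ruby returning a Rack-style response triple, not actual Rails controller code): the application only performs the access check and then hands the transfer to nginx through the X-Accel-Redirect header. The Attachment struct, the can_download? rule and the "/protected/" prefix are all hypothetical; the prefix would have to match a location marked as internal in the nginx configuration, and the filename is assumed to be sanitized already.

```ruby
# Hypothetical attachment record; the field names are assumptions.
Attachment = Struct.new(:token, :filename, :owner, :is_public, :awaiting_moderation)

# Assumed access rule: owners always see their own files; everyone else
# only sees public attachments that are not awaiting moderation.
def can_download?(attachment, user)
  return true if !user.nil? && user == attachment.owner

  attachment.is_public && !attachment.awaiting_moderation
end

# Returns a Rack-style [status, headers, body] triple. On success the body
# is empty: nginx sees X-Accel-Redirect and streams the file itself.
def download_response(attachment, user)
  unless can_download?(attachment, user)
    return [403, { 'Content-Type' => 'text/plain' }, ['Forbidden']]
  end

  headers = {
    'Content-Type' => 'application/octet-stream',
    'Content-Disposition' => %(attachment; filename="#{attachment.filename}"),
    # "/protected/" is an assumed internal nginx location, not a public URL.
    'X-Accel-Redirect' => "/protected/#{attachment.token}"
  }
  [200, headers, []]
end
```

The point of the design is that the application process never opens the file: it returns headers only, so its cost per download stays roughly constant regardless of file size.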
Uploads
The situation is not quite as clear cut with file uploads. Even if nginx reads the entire uploaded file from the client before handing it off to a Mongrel process, said Mongrel process can still be tied up for a while when it has to do MIME parsing of a potentially very large request. I'd have to experiment to see whether memory usage is a problem as well, even though in theory it should not be (you never know).
Regrettably, nginx has no built-in equivalent of X-Accel-Redirect for handling uploads. There is a third-party module, commonly known as "the nginx upload module", which does look pretty solid and well-regarded in the community. It certainly looks robust enough to at least warrant a trial.
(One possible doubt: I'm using the nginx 0.6.x stable series and the docs for the upload module only seem to mention the 0.7.x development version. I am not too keen on being pushed from the stable onto the development series; will have to do some testing to find out whether the module is compatible.)
According to the docs, you need an upload_pass configuration for the URL at which you're planning on receiving file uploads. My understanding, then, is that if you want the ability to add attachments to different models (tickets, blog posts, articles etc) then you would need an upload_pass set-up for each of those controllers, seeing as under a RESTful design you're going to be accepting those attachments in several different actions (eg. issues#new, posts#new etc), each with a different URL.
That is quite duplicative.
Worse, each of those configuration locations would require an upload_pass_form_field directive with hard-coded form fields in your nginx configuration instructing the module what other form parameters must be passed through to the backend Mongrel process. In other words, adding a new field to a form would actually require an adjustment to the nginx config!
So that really is starting to sound messy.
The alternative is to have a separate "attachments" controller and set up your upload_pass directives for a single URL on that controller. You thus avoid the duplication and reduce the amount of ugly hard-coding needed in your configuration file.
But the downside there is that you can no longer open a new issue and attach a file at the same time, because submitting your issue and your attachment is now a two-step process involving two submissions to two different controllers. This is inconvenient for users, and worse still, it is a completely broken workflow for anonymous submitters (because those will have their issue submissions held for moderation, and they won't be able to attach anything at all until their ticket makes it out into public view, something that they may not stick around to see).
The compromise that I am thinking of settling on, then, is the following:
Basically, submitting attachments would be done via AJAX. The downside is that you can't attach anything unless you have JavaScript enabled (although it could be made to degrade semi-gracefully). The upside is that even issues held for moderation can now have stuff attached to them, and the actual UI for doing so would be fairly slick.
The workflow would look something like this:
- user visits issues#new and fills out form
- clicks "add attachment" and uploads a new attachment, effectively performing an AJAX request to attachments#create
- attachments#create duly creates the attachment, which, depending on the current user, may be held for moderation or not, and returns a JSON response containing the id of the new attachment
- the browser takes the id and adds a hidden field to the issue form noting that it has an attachment with said id
- when the user finally posts to issues#create, the controller knows that the issue (apparently) includes an attachment with the given id
Observations:
- this works for multiple attachments at a time
- when the attachment is first created it will have NULL values for the association (attachable_type and attachable_id)
- when the issue is then created, it should expect to find that NULL association and take "ownership" of the attachment
- to prevent race conditions and attacks (another user trying to guess attachment ids and take "ownership" of them by creating malicious new issue posts), we would have to rely on some kind of security through obscurity (that is, unpredictable ids), because we evidently can't set up any association at the time the attachment is uploaded, as we still don't know (and can't know) what the id of the final "attachable" model will be
- some random hash-y thing should be good enough for that (eg. instead of predictable ids like "/attachments/1" we should have ones like "/attachments/af9a3cc83256e6388ad6677c7fb665db877a00b3")
- in the case of editing attachments of an existing "attachable" model this is evidently not a problem because in that case we do already know the type and id of the model
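The workflow and observations above could be sketched roughly as follows. This is plain Ruby with an in-memory hash standing in for the attachments table, so the AttachmentStore class and its method names are hypothetical; only attachable_type, attachable_id and the 40-character hex id format come from the notes above. A real implementation would also need the claim step to be atomic at the database level (for example, an update conditional on the association still being NULL), which this sketch does not attempt.

```ruby
require 'securerandom'
require 'json'

# In-memory stand-in for the attachments table (hypothetical).
class AttachmentStore
  Record = Struct.new(:token, :attachable_type, :attachable_id)

  def initialize
    @records = {}
  end

  # Roughly what attachments#create would do: store an unowned attachment
  # under an unguessable token and return the JSON the browser needs.
  def create
    token = SecureRandom.hex(20) # 20 random bytes => 40 hex characters
    @records[token] = Record.new(token, nil, nil)
    JSON.generate('id' => token)
  end

  # Roughly what issues#create would do for each token found in the hidden
  # fields: take ownership only if the attachment exists and is unowned.
  def claim(token, attachable_type, attachable_id)
    record = @records[token]
    return false if record.nil? || !record.attachable_id.nil?

    record.attachable_type = attachable_type
    record.attachable_id = attachable_id
    true
  end

  def find(token)
    @records[token]
  end
end
```

Usage would look something like `token = JSON.parse(store.create)['id']` on upload, followed later by `store.claim(token, 'Issue', issue_id)`; a second claim on the same token returns false, so an attachment that has already been taken cannot be re-assigned by guessing its id.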
As far as how these things would actually be stored on disk, we don't want to risk too many files accumulating in the same directory so we could use a "partitioned" scheme; the "af9a3cc83256e6388ad6677c7fb665db877a00b3" example mentioned above could actually be stored at attachments/af/9a3cc83256e6388ad6677c7fb665db877a00b3, thus dividing the key-space into 256 buckets. (Although in reality we might just divide it into 16 buckets because, according to the docs for the upload_store directive of the nginx upload module, "all subdirectories should exist before starting nginx", and that might be somewhat painful.)
On the subject of "hashed" subdirectories:
- http://www.serverphorums.com/read.php?5,5603,5604,quote=1: doesn't really answer my question; the poster says that given configuration of upload_store "/hdd1/_uploads" 1; "i had to create 10 directories 0-9 in the above"
- http://www.motionstandingstill.com/nginx-upload-awesomeness/2008-08-13/: also uses the same example, "The directory is hashed, subdirectories 0 1 2 3 4 5 6 7 8 9 should exist"
See also:
- http://www.motionstandingstill.com/ngnix-upload-awesomeness-pt2/2008-08-20/: a follow-up to the post linked-to above.
- http://www.motionstandingstill.com/using-nginx-to-send-files-with-x-accel-redirect/2008-09-03/: related post on X-Accel-Redirect.
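For the "partitioned" on-disk scheme described earlier in this comment, a minimal sketch might look like the following. The function names and the "attachments" root are assumptions for illustration; the split of a hex id into a bucket prefix and a remainder is the scheme from the text. With the defaults, the example id maps to attachments/af/9a3cc83256e6388ad6677c7fb665db877a00b3.

```ruby
# Splits a hex token into "<bucket>/<rest>" under a root directory.
# bucket_chars = 2 gives 256 buckets; bucket_chars = 1 gives 16.
def partitioned_path(token, bucket_chars = 2, root = 'attachments')
  # Only accept plain hex so a token can never escape the root directory.
  unless token.match?(/\A[0-9a-f]+\z/) && token.length > bucket_chars
    raise ArgumentError, "unexpected token: #{token.inspect}"
  end

  File.join(root, token[0, bucket_chars], token[bucket_chars..-1])
end

# Every bucket directory name for a given prefix length, which is handy if
# the directories have to be created ahead of time.
def bucket_names(bucket_chars = 2)
  (0...(16**bucket_chars)).map { |n| format("%0#{bucket_chars}x", n) }
end
```

Pre-creating the buckets is then a matter of iterating over bucket_names and creating each directory under the root, which is cheap for 16 buckets and still manageable for 256.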
-
Greg Hurrell
A note on what I meant about "semi-graceful" degradation for browsers without JavaScript enabled:
Basically, you'd submit the form to attachments#create and show the user a page which said something like "your attachment has been created, this is the URL you can use to refer to it"; the message would be very similar for both moderated and unmoderated submissions (in the former case you'd see something similar to what people see when their issue is held for moderation; in the latter you'd see an attachments#show action, which I suppose would give metadata about the attachment rather than showing the attachment itself).
Could possibly add support to the wikitext module for turning strings like "attachment #x" where "x" is some integer into hyperlinks.
-
Greg Hurrell
Given that the upload_store directive refers to what is essentially a temporary directory, I don't think there's much need to worry about hashing it.
For backend errors the upload_cleanup directive handles automatic removal, and for the non-error code path we should really be moving the file to its final resting place (where we can implement whatever hashing scheme we see fit).
-
Greg Hurrell
Having trouble with proxy_temp_path and the upload module (see "Updating to nginx 0.6.36 with the nginx upload module 2.0.9").
A bit busy right now, but must find out whether this is a new bug in nginx 0.6.36 or is caused by the upload module.
-
Greg Hurrell
See ticket #1388 for another possible use for attachments: they could be used to deliver downloadable assets for appcasted releases. Still need to decide whether these attachments would be independent resources (no parent), or would actually be associated with the given release.
The nesting on the associations would be starting to get pretty deep:
- /attachments/7d4e12254d339cce521486f482a02dc447a43b12 (parentless attachment)
- /issues/12345678/attachments/7d4e12254d339cce521486f482a02dc447a43b12 (attachment on an issue)
- /blog/foo-bar-baz-wow/attachments/7d4e12254d339cce521486f482a02dc447a43b12 (attachment on weblog post)
- /wiki/Patching_XYZ_1.0.2_for_Mac_OS_X_Snow_Leopard/attachments/7d4e12254d339cce521486f482a02dc447a43b12 (attachment on wiki article)
- /products/synergy/releases/12.10/attachments/7d4e12254d339cce521486f482a02dc447a43b12 (attachment on product release)
As I noted in ticket #1388, I was wanting release assets to have URLs like:
/products/synergy/releases/12.10/notes
So I might actually need two resources here:
- the attachments resource, which stores the actual asset as a parentless resource
- an "Asset" resource which would allow us to set up the "has_many" association on the product release, and would map attachments (assets) to releases
Not sure if that's the best way to go though.
Alternatives that would allow us to add attachments to releases and not create any other models:
- just tolerate the long, "ugly" URLs (one side benefit: people couldn't guess product download URLs prior to official release)
- add a "permalink" field to the attachment model which would allow an admin to define an access shortcut like "notes" or "dmg" which would be valid within the context of the parent resource
The latter has a fairly "icky" level of complexity just for the purpose of making pretty URLs. In reality the only user/consumer of such URLs would be the appcast client (the product itself) and the developer who has to make blog posts and link to the new version. In the former case appcast feeds should always be machine generated so it shouldn't be a problem. In the latter case the developer should probably just link to the download page, but the problem of providing the right download link there still remains...
-
Greg Hurrell
If we end up using attachments to distribute release assets (ticket #1388) we will want to add a counter field to the model to count "hits". See the "Link" resource for an example of this.
Basically we'd increment the counter, check access permissions, then do the X-Accel-Redirect thing so that nginx could actually handle the download.
Comments are now closed for this issue.