An article about HTML and JavaScript—what's with that? I implement user interfaces for educational tools. Java on the client isn't really an option any more. The students use the tools through browsers, sometimes on devices without a Java runtime (such as an iPad).
For what I need to do, modern HTML and JavaScript are ok. I toyed with using ScalaJS or JSweet or Elm or whatever, but in the end, keeping it simple has many advantages. It was my good fortune that Internet Explorer ceased to matter when I started. I very rarely ran into situations where I needed to worry about browser incompatibilities. Except for drag and drop, the subject of this article.
I had to implement a “Parsons puzzle” tool. A Parsons puzzle is a programming activity where students rearrange code tiles into a code snippet. These are good activities for beginners because students don't have to worry so much about syntax. The tool can be set up with any sequence of code tiles. When I design a particular problem, I like to put in a few distractors so students have to think carefully and don't just get 100% by reproducing the basic shape of the program.
Obviously, dragging the tiles is a very natural user interface for this activity. So that's what I had to implement. (Here is a sample.)
How difficult can drag and drop be? You use it every day. You drag something and you drop it.
Actually, it isn't all that simple. First, you somehow identify the thing that you want to drag. How? You don't click on it with the mouse. That would be a click. Instead, you move the mouse over it, depress the primary mouse button, then, while leaving the mouse button depressed, move it a little bit. What about touch interfaces? It is similar, except there is a bewildering set of tap and pinch and pan and long-press gestures, and it is hard to figure out which ones you don't want to interfere with.
That is why you don't want to use low-level events to implement your own drag and drop logic. Instead, the HTML standard provides an API that leaves the detection of drag and drop events to the browser (and ultimately, the host desktop).
The Drag and Drop API deals with four issues:
You can always drag images, text selections, and links. Usually you'll want to drag them outside the browser to somewhere else. Similarly, you can drag things from elsewhere into the browser (such as URLs or files from a file explorer). I am not interested in this form of drag and drop right now.
To make an arbitrary HTML element draggable, you must do three things:
Set the draggable attribute of the element to true.
Add a listener for the dragstart event.
Set the transfer data in the listener.
You really must do all of those things, whether you want to or not. If you do not set the transfer data, Firefox will not consider the element draggable. It is tempting to set the transfer data to a string like "Take that, Firefox!". But that might be embarrassing when the element is dragged into another app. Just provide some innocuous text.
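Here is a minimal sketch of those three steps. The helper name makeDraggable and its label argument are my own inventions for illustration, not part of any API.

```javascript
// Hypothetical helper: applies the three required steps to one element.
function makeDraggable(element, label) {
  // Step 1: mark the element as draggable.
  element.setAttribute('draggable', 'true')
  // Step 2: listen for dragstart.
  element.addEventListener('dragstart', function(e) {
    // Step 3: set some transfer data, or Firefox refuses to start the drag.
    e.dataTransfer.setData('text/plain', label)
  })
}
```

You would call it as makeDraggable(someElement, 'code tile'), where someElement is whatever element you want to drag.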
Ponder for a minute what must go on inside Firefox. The drag gesture is initiated. The browser looks for the dragstart listener. If it doesn't exist, it concludes that the programmer was just kidding when making the element draggable. Then the browser calls the dragstart listener. When the listener returns, the browser checks whether the transfer data is really set. If not, it considers it a joke listener and refuses to start the drag. Who came up with this???
dragstart and dragend Listeners
The dragstart event gives you a bunch of information about the drag start. The most useful are:
There is also a dragend event where you can undo anything that you set up in dragstart. This happens after a successful drop, or when the drag was canceled or abandoned.
Back to dragstart. The cursor position is particularly valuable when you drag and drop a geometric shape. The user probably didn't touch it in its perfect center, and you will want to take the offset into account when the object is dropped. (See this blog for the geometry.)
Except (insert evil laugh) on the iPad. It took Safari on iOS ages to support HTML drag and drop at all, so you'd think they'd have had plenty of time to get it right. But it's not so. In iOS 12, the clientX/screenX and clientY/screenY coordinates are simply wrong. I have no idea what they report. They don't even depend on where you touch.
Weirdly enough, the Apple documentation states:
“The x and y parameters of setDragImage specify the point of the image that should be placed directly under the mouse. This value is typically the location of the mouse click that initiated the drag, with respect to the upper-left corner of the element being manipulated. Unfortunately, obtaining this information in a cross-browser fashion is easier said than done. There is no standard way to determine the position of the mouse relative to the document because different browsers implement the standard event values in subtly incompatible ways.”
In Firefox, Chrome, Edge, and Safari on the Mac, I can use clientX/clientY relative to element.getBoundingClientRect(). Only iOS fails. Not in a subtle way.
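The offset arithmetic can be sketched like this. The function names grabOffset and dropPosition are hypothetical, and the sketch assumes a browser that reports sane clientX/clientY values, so it won't rescue you on iOS 12.

```javascript
// Hypothetical helper: where inside the element did the drag start?
function grabOffset(e, element) {
  const bounds = element.getBoundingClientRect()
  return {
    x: e.clientX - bounds.left, // distance from the element's left edge
    y: e.clientY - bounds.top   // distance from the element's top edge
  }
}

// Hypothetical helper: viewport position for the element's top-left corner
// at drop time, so that the grabbed point ends up under the cursor.
function dropPosition(e, offset) {
  return { left: e.clientX - offset.x, top: e.clientY - offset.y }
}
```

Call grabOffset in the dragstart listener, hang on to the result, and pass it to dropPosition in the drop listener.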
As the drag proceeds, a copy of the dragged element is displayed. In my use case, that's the code line that is being dragged. That's good. The student drags the code line to the left. I display indentation levels. The student aligns the tile and drops it.
Except (insert evil laugh) on the iPad. Unlike every other browser known to man, Safari on iOS takes it upon itself to resize the dragged element...to make it smaller and cuter and more adorable. Of course, with that adorable visual enhancement, now you can't align the tile any more.
It is possible to set an image instead. Do this in the dragstart listener:
    let img = document.createElement('img')
    img.src = ...
    e.dataTransfer.setDragImage(img, 32, 32)
If the image is small enough, then Safari on iOS leaves it alone, and my users can accurately position the drop location—particularly if they have transparent fingers.
Safari on iOS does not have a monopoly on idiotic behavior. With Chrome and Firefox, the drag image must be part of the DOM. Because the browser implementors couldn't be bothered to deal with this, you, the programmer, must insert it into the DOM and, because it would look stupid to see the image twice, hide it somehow. But the obvious hiding mechanisms (display: none, left: -1000px, etc.) cause the drag image not to show up. Try something like
    position: absolute; left: 0px; top: 0px; z-index: -1;
Or preload another copy of the image.
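Putting the hiding trick into code might look like the following sketch. The helper name createHiddenDragImage is made up, doc stands for the page's document object, and src is whatever image URL you use.

```javascript
// Hypothetical helper: adds a drag image to the DOM without showing it twice.
function createHiddenDragImage(doc, src) {
  const img = doc.createElement('img')
  img.src = src
  // Tuck the image behind the page content. Per the discussion above,
  // display: none or an off-screen position would suppress the drag image.
  img.style.position = 'absolute'
  img.style.left = '0px'
  img.style.top = '0px'
  img.style.zIndex = '-1'
  doc.body.appendChild(img)
  return img
}
```

Create the image once, ahead of any drag, so that it has a chance to load. Then pass it to e.dataTransfer.setDragImage in the dragstart listener.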
Some browsers allow you to use a Canvas or SVG element for the drag image. Safari sensibly lets you use any element, except SVG didn't work on (you guessed it) the iPad.
To be a drop target, an element must provide both drop and dragover listeners.
It is not enough to merely provide a drop listener, even if you have nothing useful to do in dragover. The dragover listener is fired “every 350ms (±200ms) milliseconds”, and the browser doesn't take it lightly if you don't provide that listener.
And there is always one thing to do. You must call e.preventDefault() in the listener. Because if you don't, then the drag is deemed to be canceled. Apparently you need to reassure the browser every few hundred milliseconds that you want to continue the operation.
There are also optional dragenter and dragleave events that you can use to light up drop targets, typically by adding and removing a style. There is no equivalent of a pseudoclass like :hover for drag and drop. (The W3C spec says that you must provide a dragenter listener, but that doesn't seem to be enforced.)
In dragenter, call e.preventDefault() when this target is willing to accept the drop. Otherwise, return without calling preventDefault. In dragleave, you don't have to call preventDefault.
The Mozilla docs say cheerily: “The dragleave event will always fire, even if the drag is cancelled, so you can always ensure that any insertion point cleanup can be done during this event.” It ain't so. If the element is dropped, then dragleave is not called (at least with Chrome). You need to duplicate the dragleave cleanup in the drop listener.
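One way to avoid writing the cleanup twice is to share a single function between the two listeners. This is only a sketch: the helper name highlightWhileDragging and the class name that you pass in are assumptions, and it accepts every drag unconditionally.

```javascript
// Hypothetical helper: toggles a CSS class while a drag hovers over the target.
function highlightWhileDragging(target, className) {
  const clear = function() { target.classList.remove(className) }
  target.addEventListener('dragenter', function(e) {
    e.preventDefault() // This target is willing to accept the drop
    target.classList.add(className)
  })
  target.addEventListener('dragleave', clear)
  // dragleave may not fire after a drop, so run the same cleanup there.
  target.addEventListener('drop', clear)
}
```

This takes care of the highlighting only. The target still needs its dragover listener, and a drop listener that does the actual work.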
Firefox also has a dragexit event that is fired just before dragleave, but other browsers don't have it. Just ignore it.
Finally, there is a drag event that keeps occurring throughout the drag operation. You don't have to listen to it, and I don't know why you would want to. But if you do, be sure to call preventDefault, or the operation is canceled.
During the drag process, the browser may provide some visual feedback that indicates the nature of the operation, typically with different mouse cursors. The effect choices are:
move
copy
link
The default is move, but users can select copy or link effects by hitting keyboard modifiers.
In the dragenter, dragover, and drop handlers, you get the provided effect, as a string, in e.dataTransfer.dropEffect.
If you like, you can override that effect, by assigning a different string. For example, if a particular drop target always wants the move effect, no matter what modifier keys your users press, add this call to the dragover listener:
    e.dataTransfer.dropEffect = 'move'
But it's not that simple. The drop target isn't in sole control of that process. Perhaps the drag source (which might come from a different web app) isn't happy with a move operation.
The drag source must set the allowed effects to one of the following nine strings:
none
copy
copyLink
copyMove
link
linkMove
move
all
uninitialized
If you implement drag and drop within one web app, just choose all in the dragstart listener and then deal with the details in dragover:
    e.dataTransfer.effectAllowed = 'all'
There has been a fair amount of confusion about this at StackOverflow, but I think that's because people don't always think about cross-app drag and drop. Once you consider that, the setup makes sense.
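To see how the two settings fit together, here is a sketch of which drop effects each allowed-effects string permits. The table and the function name permits are illustrations of my reading of the names, not a browser API. I treat uninitialized (the value you get when the source sets nothing) like all.

```javascript
// Which drop effects does each effectAllowed value permit?
const PERMITTED = {
  none: [],
  copy: ['copy'],
  copyLink: ['copy', 'link'],
  copyMove: ['copy', 'move'],
  link: ['link'],
  linkMove: ['link', 'move'],
  move: ['move'],
  all: ['copy', 'link', 'move'],
  uninitialized: ['copy', 'link', 'move']
}

// Hypothetical helper: true if the source's effectAllowed covers the
// dropEffect that the target asks for.
function permits(effectAllowed, dropEffect) {
  if (!Object.prototype.hasOwnProperty.call(PERMITTED, effectAllowed)) return false
  return PERMITTED[effectAllowed].includes(dropEffect)
}
```

For example, a source that allows copyMove is fine with a target that insists on move, but not with one that wants link.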
In the drop listener, first of all, call preventDefault for a successful drop, or return without calling it if you reject the drop.
You can, if you care, find the drop effect that the user selected, and act accordingly.
If you want to handle a cross-app drop, you then retrieve the transferred data. This is its own can of worms, because the browser support is inconsistent. Reliably sending a JSON string from one app to another seems quite an adventure. I am not going to dwell on it here.
For drag and drop in the same app, don't bother with data transfer. Just set up an object in dragstart and retrieve it in drop.
Here is a template with all the operations in one place. Keep in mind that this is for drag and drop within a single app. Otherwise, you need to delve into the data transfer.
Do this for the drag sources:
    let dragData = undefined
    let dragSource = ...
    dragSource.setAttribute('draggable', true)
    dragSource.addEventListener('dragstart', function(e) {
      e.dataTransfer.effectAllowed = 'all'
      e.dataTransfer.setData('text/plain', dragSource.textContent) // Firefox needs this
      let bounds = dragSource.getBoundingClientRect()
      dragData = {
        x: e.clientX - bounds.left,
        y: e.clientY - bounds.top
        // Any other data...
      }
      // Any other setup...
    })
    dragSource.addEventListener('dragend', function(e) {
      // Any other teardown...
      dragData = undefined
    })
Do this for the drop targets:
    dropTarget.addEventListener('dragenter', function(e) {
      if (dragData === undefined) return // Some sort of foreign drop
      e.preventDefault()
      dropTarget.classList.add('dragover')
    })
    dropTarget.addEventListener('dragleave', function(e) {
      dropTarget.classList.remove('dragover')
    })
    dropTarget.addEventListener('dragover', function(e) {
      e.preventDefault()
      e.dataTransfer.dropEffect = 'move'
    })
    dropTarget.addEventListener('drop', function(e) {
      dropTarget.classList.remove('dragover') // Duplicate from dragleave
      if (dragData === undefined) return // Some sort of foreign drop
      e.preventDefault()
      // Drop action...
    })
I am not the only one who thinks HTML Drag and Drop is a drag. Check out these blogs for more exasperation and other tips.
Why can't this spec get some love? A great deal of the busywork could be removed. Imagine that a drag source would just have to listen to dragstart, and a drop target only to drop. Imagine there was a pseudoclass for accepted drop targets.
Why do browser implementors have this passive-aggressive attitude towards drag and drop? The pain points could be fixed with a minimum amount of trouble. Surely far less than the trouble that they went through in the first place to implement the spec. I guess it's that iron law of software engineering: “If it was hard to code, it should be hard to use.”