My guess is that it will be a problem with handling Unicode in some way. There will be a page on emojipedia with some Unicode emojis in the title or description that Google can’t handle while trying to extract the text to display a result.
After seeing this, I can't be the only one who got curious as to how many emojis there actually are on iOS.
Obviously, a quick google doesn't work right now.
So I did this to try to figure out: emojipedia.org, the site that supposedly breaks google, has a page that appears to show all the available emojis on iOS [1]. On this page, all except the first 21 emojis are displayed in a way that uses lazy loading of the images. These images are contained in <li class="lazyparent"> elements. Assuming that there are no other <li class="lazyparent"> elements on the page, we should get the number of emojis on iOS simply by counting those elements.
You can also use `querySelectorAll` and pass the parent's selector. I didn't get to post until I was away from my computer, so I can't recall the exact selector, but the number was the same.
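A minimal sketch of the counting approach described above. On the real page you would run `document.querySelectorAll("li.lazyparent").length` in the browser console; here the same count is demonstrated on a tiny inline HTML snippet so it runs anywhere. The `li.lazyparent` selector is taken from the comment and may well have changed since.

```javascript
// Demonstration of counting <li class="lazyparent"> entries, as the
// comments describe doing on emojipedia.org. A tiny sample stands in
// for the real page markup.
const sampleHtml = `
  <ul>
    <li class="lazyparent">grinning face</li>
    <li class="lazyparent">face with tears of joy</li>
    <li class="other">not an emoji entry</li>
    <li class="lazyparent">party popper</li>
  </ul>`;

// Without a DOM available, a regex over the markup gives the same
// count for this simple structure (a real HTML parser is safer on
// arbitrary pages).
const count = (sampleHtml.match(/<li class="lazyparent"/g) || []).length;
console.log(count);
```

Per the thread, the first 21 emojis on the real page are not lazy-loaded, so the total would be this count plus 21.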
I think my favorite part is that this is so outrageously unheard of that the error isn't even CSS-styled correctly. Zero left margin, just up against the edge of the window. Nobody even thinks about this case of an error being displayed because it's so rare.
It also makes me think that there's probably broken ownership of the plumbing around the obviously-novel routing/layout/presentation path taken when this error is shown.
A while back while poking around I landed on an interstitial page of some kind using an old 2005-era Google logo, which was cute. Obviously nobody'd noticed and submitted the page in question to whatever bit of internal gunk was presumably used to track where the logo needed updating.
But that makes me wonder - the glitch I saw was a platform-level thing easily missed - but is Search that much of a vast, not-particularly-cohesively-integrated expanse too?
Given the volume of searches, your counterexample being from 2020 seems to just prove the point.
But I would imagine in most cases, when there's a backend error, you could fall back on some simpler engine and display that without the user having any clue.
No. That is insane to even think about. Google practices a blameless postmortem process that focuses on avoiding repeat incidents and fixing processes. This is not just something we write down in books but something we do on a daily basis and believe in.
The only way I can think of someone being punished is if, of course, it was done maliciously.
Disclaimer: I am a Google SRE, opinions my own, not an official comment.
But the emoji search add-on feature will have been developed by a team in Mountain View... So if emergency code changes need to be made for this bug HN has just found, they will still be woken up.
Possibly, but probably not. Google tends to have follow-the-sun rotations staffed by SRE as the first line of defense, with an escalation path to devs if necessary.
SREs may not have intricate knowledge of the code but they have an array of tools that can mitigate problems, and they're software engineers too, so they can debug this themselves if need be. Their focus however won't be on fixing the bug, it will be on stopping the bleeding. Typically they'll check if this is an issue introduced in a recent rollout and roll that back. In the case of Search they also have an array of tactical tools - someone mentioned that a specific website was causing this, they probably have a way to quickly and temporarily delist this specific result.
The focus will be on recovering ASAP, figuring out the details and the long-term fix later. That later can be during business hours on Monday.
Any major product at Google has an on-call team with offices in multiple time zones. Typically, two engineers from different time zones will be on-call at any given time—they switch back and forth between primary and secondary, so nobody is primary on-call during the middle of the night, local time.
For example, you might have an SRE team in Mountain View, and an SRE team in Dublin. Maybe the engineer in Mountain View is primary on-call until midnight, and then it switches to the engineer in Dublin, who starts their shift at 8:00 AM.
If it’s getting code changes, it’s not getting code changes right now. The software engineers may be in Mountain View, but they won’t patch this and push out a new version during the night. Someone (read: an SRE) may change a configuration or push out a temporary fix now, and any code changes will only go out after significant testing. Generally speaking, you don’t hotfix high-profile production services. You rollback, you set up filters, you disable features, you turn off component services, you run in a degraded capacity—but if you want to actually change the code, you take your time and do it right. Rushing out a code change in the middle of the night is liable to make things worse, and nobody wants to do it.
For the right definition of "major". Search, Ads, Gmail, Docs, absolutely. If you only have 50 million DAU then you may not even have an SRE, let alone one in multiple time zones. Most teams have an on-call rotation, and you get a phone call an hour after you fall asleep unless you recently did a push or go to bed early. (Or you just forget all the pages that happened during normal hours and only remember the annoying ones.)
It really depends on what is happening behind the scenes. Is it killing an entire server process or just an error in a single thread? Even if it kills an entire server process, Google has lots of these running and the ratio may not be high enough to page.
That depends; I don't think there needs to be anything emoji-specific (at the character level). It could be a special handler for inline results, marketing filters, etc. that's gated on some variation of that string.
Might be worth seeing if there's another "how many X on iOS" that faults as well, because I would bet that it reports an ISE on timeouts, and you could easily imagine some service making a follow-on request that now times out.
It's likely that there is some tool for blocking or rewriting certain queries or results that trigger errors, so that the code changes can be made with less rush, more testing, and during business hours.
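A hypothetical sketch of the kind of blocklist this comment imagines: intercept known-bad queries before they reach the fragile feature, so the real fix can ship later with proper testing. Every name here (`BLOCKED_QUERIES`, `handleSearch`, the `disableInlineAnswers` option) is made up for illustration; nothing in the thread confirms how Google actually does this.

```javascript
// Illustrative query blocklist: known-bad queries get served with the
// fragile feature disabled instead of erroring out entirely.
const BLOCKED_QUERIES = new Set(["how many emojis ios"]);

function handleSearch(query, runFullPipeline) {
  const normalized = query.trim().toLowerCase();
  if (BLOCKED_QUERIES.has(normalized)) {
    // Serve plain results, skipping whatever component 500s.
    return runFullPipeline(query, { disableInlineAnswers: true });
  }
  return runFullPipeline(query, {});
}
```

The point of such a mechanism is operational: an SRE can add an entry with a config push in minutes, while the actual code fix goes through normal review and rollout.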
Why do you think so? Lots of important systems are developed in European offices at least, I wouldn't be surprised if some are in Asia as well. It isn't like Google search is only in MTV.
Would definitely be interesting to read a report on what triggers this error! I don't think it is a tile error, as presumably the backend that collects all the answers would be set up in a way that is resilient to upstream errors. So maybe the core results compiler?
I'm not sure exactly how it works but if you look at the network requests for "how many emojis ios" vs "how many emojis android" the html for the initial page contains an internal server error whereas the android version contains some more javascript which contains the search results.
The likeliest candidate is probably the final render step. Under the hood, search farms the query out to multiple backends, gathers partials from them, and then composites those partials into the final rendered page. For an actual 500 to surface to the user, that final compositing step is probably what fails; gathering and compositing are already robust against an individual backend failing to return a result.
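The gather-then-composite flow this comment describes can be sketched as follows. Individual backend failures are tolerated during gathering, but an exception inside the final compositing step surfaces as a 500 for the whole request. All names are illustrative, not Google's actual architecture.

```javascript
// Sketch of fan-out, gather, and composite. Backend failures during
// gathering just drop that partial; only the compositing step can
// fail the whole page.
async function renderSearchPage(query, backends) {
  // Gathering is robust: a rejected backend yields no partial.
  const settled = await Promise.allSettled(backends.map(b => b(query)));
  const partials = settled
    .filter(r => r.status === "fulfilled")
    .map(r => r.value);

  // Compositing is the fragile step: if it throws (say, on a partial
  // containing text it can't handle), the request errors out.
  return composite(partials);
}

function composite(partials) {
  return partials.map(p => `<section>${p}</section>`).join("\n");
}
```

Under this model, a single misbehaving inline-answer backend wouldn't explain a user-visible 500, but a partial that poisons the compositor would.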
I'm on board with other people's suggestions that asking a question like this is tripping over a site that surfaced an emoji in their site title that is choking the Unicode library that Google is relying upon.
Server Error
We're sorry but it appears that there has been an internal server error while processing your request. Our engineers have been notified and are working to resolve the issue.
Please try again later.
It appears to cause errors with my Google Home as well. Asking it "How many emojis on iOS?" (exactly the same as the OP string) causes my google home to stall for quite a bit (about 20 seconds or so).
Looks like it was just fixed, as I got an infinite load a few minutes ago but now it works fine. My fanciful guess is that they use a LM (language model) to determine if something is an answerable question, and if so, parse certain pages with an LM. Maybe the LM stuttered on something on emojipedia.
I would guess it has nothing to do with the index, and more with widgets, those special little result boxes that appear when you do a calculation, ask for the weather, and so on. Google has quite a few of them, and some are pretty whimsical. I read that new people on the search / frontpage team get to write one of those as a training task. They must be quite encapsulated from the rest of the system. But obviously at some point one of them is going to have a bug.
Just guessing, maybe it was supposed to show a certain emoji, like "hamburger emoji on iOS", and got tripped up by the result of a search like "how many site:emojipedia.org" which leads to the FAQ and that can't be parsed?
Fascinating! It may be trying to create that response panel on the top with the absolute number but maybe it's erroring out when trying to parse the actual Unicode points.
According to this https://news.ycombinator.com/item?id=32393051 , a couple years ago people were getting the "We're sorry but it appears that there has been an internal server error while processing your request. Our engineers have been notified and are working to resolve the issue." and it was because of a fireball?
Not only on .nz ... this happens on the main site google.com too, displaying the message "Server Error : Internal server error..." after the page takes a long time to download.
That suggests every attempt at this search might be triggering a high CPU load somewhere before the message is displayed.
Taking a while to load is probably indicative of the problem... There's probably some O(n^3) memory or compute complexity going on somewhere in the number of emojis on the page.
For example, I imagine Google uses as a final ranking step some stuff based on the similarity of the pages about to be returned - to make sure you aren't about to show the user two pages that are practically identical. That logic might try to build similarity mappings between the pages, and has logic that fails badly for large numbers of emoji.
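The pairwise-similarity dedup step imagined above can be sketched like this. For n candidate pages it does O(n^2) comparisons, and if each comparison is itself linear in page size, total work grows quickly, which is the kind of blow-up the comment speculates about. Jaccard similarity over word sets is a stand-in for whatever Google actually uses; this is pure illustration.

```javascript
// Jaccard similarity between two documents, treated as word sets.
function jaccard(a, b) {
  const sa = new Set(a.split(/\s+/));
  const sb = new Set(b.split(/\s+/));
  const inter = [...sa].filter(w => sb.has(w)).length;
  return inter / (sa.size + sb.size - inter);
}

// Drop any page that is near-identical to one already kept.
// This is the O(n^2) step: every candidate is compared against
// every kept page.
function dedupe(pages, threshold = 0.9) {
  const kept = [];
  for (const page of pages) {
    if (!kept.some(k => jaccard(k, page) >= threshold)) kept.push(page);
  }
  return kept;
}
```

A page consisting of thousands of near-identical emoji tokens is exactly the kind of input where set construction and intersection could behave badly if the tokenizer or Unicode handling has a flaw.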
I would guess it's the time that is the problem - whatever service would produce an inline answer is presumably falling over, and the subsequent timeout is reported as an error.
My guess is this is related to instant answers section at the top of the search results. It's trying to figure out the count, but it's failing for one reason or another.
Sounds like they are recognizing and redirecting searches to specific functions. It is obvious, but with this error we can conjecture that some of these functions are written as separate code snippets without much engineering responsibility attached.
Just my two cents remembering Google acquisition of Freebase and a quest for understanding content at a higher level (e.g. number of retweets in a tweet).
Hilariously, the third result on Kagi is a direct link to the google search that crashes, I wonder where it pulled this from: https://imgur.com/a/sCmeADy
> We're sorry but it appears that there has been an internal server error while processing your request. Our engineers have been notified and are working to resolve the issue.
Where are you seeing a defense? I only see attempts to explain it or reproduce the error with different inputs, aka things hackers tend to do.
You can find Google bad without losing your mind about it.
My guess: when you search, google fires up several background processes/threads for that query.
Not only search, but also quick answers, maps, ads. These services might be slower than the search query, or exceed a max response time, in which case they get ignored.
When the search query finishes, they might wait a few milliseconds for the other services if it's still under the target response time. After that, Google will render the results with all the info that completed.
My guess is that one of these services crashes, most likely the main search one. That way, you could end up waiting forever for it to finish.
So it’s probably a bug in the way it’s yielding/waiting for the spawned processes.
Another option is that for certain features they don't spawn a background process, because it's faster to run locally/direct/in-memory, since the data is quite small and just runs on every node.
They'd start this after spawning the other processes, so if this service returns an error, it might screw up the yielding/waiting/collecting, although I'd expect an exception/error page instead of the indefinite waiting that's happening.
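The fan-out-with-deadline flow guessed at above can be sketched as follows: the main search result is required, while auxiliary services (answers, maps, ads) are raced against a short budget and dropped if late. A crash or hang in the required branch is what could stall the whole page. All names and timings are illustrative assumptions.

```javascript
// Race a promise against a deadline; late results resolve to null
// and are dropped.
const withTimeout = (p, ms) =>
  Promise.race([p, new Promise(res => setTimeout(() => res(null), ms))]);

// Build a results page: the main search result is awaited without a
// guard (so a hang there hangs the page), while extras are bounded
// by a small time budget.
async function buildPage(query, { search, extras, budgetMs = 50 }) {
  const main = await search(query);
  const optional = await Promise.all(
    extras.map(e => withTimeout(e(query), budgetMs).catch(() => null))
  );
  return { main, extras: optional.filter(x => x !== null) };
}
```

In this sketch, an extra that errors is silently dropped, consistent with the comment's expectation; only a failure in the unguarded main branch would produce the indefinite wait or error page users saw.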
Because you are just stating what is obviously happening at a very high level ("it crashes because something crashed"), it's generic and applicable to any website. Your comment says nothing about why it happens only for that query specifically.
I didn’t downvote you, but if folks are coming here looking for answers then a simple “my guess” based (apparently at least) on pure speculation and without inside knowledge of how Google’s setup _actually_ works, doesn’t add much to the conversation.
If your post had included something like “I used to work on Google search”, “I used to work on Bing search”, or “I’ve spotted this piece of data in the network responses being sent to my browser and therefore” you might have gotten a more favourable reception.
Doesn't match my experience here on HN. As far as I can tell educated outsider guesses are generally welcome and - to me personally - sometimes even enlightening.