Closed Bug 1129492 Opened 9 years ago Closed 2 years ago

Firefox content process has a live connection to the X11 server.

Categories

(Core :: Security: Process Sandboxing, defect, P3)

x86_64
Linux
defect

Tracking

()

RESOLVED FIXED
99 Branch
Fission Milestone Future
Tracking Status
relnote-firefox --- 99+
firefox-esr68 --- wontfix
firefox76 --- wontfix
firefox77 --- wontfix
firefox78 --- wontfix
firefox99 --- fixed

People

(Reporter: jld, Assigned: gcp)

References

(Blocks 2 open bugs, Regressed 1 open bug)

Details

(Whiteboard: sblc4, qa-not-actionable)

Attachments

(2 files)

See also bug 582057 comment #13 for historical context.  We basically can't have a meaningful sandbox for content processes on X11 platforms as long as that file descriptor is there.

Chromium doesn't do this — indeed, it doesn't connect to the X11 server from the renderer process in the first place — so it is possible.  The question is how much work would be needed.
I've used conditional breakpoints to find where we're using the X11 socket, and I'm attaching a selection of stack traces.  It seems to be heavily used in layers/graphics code, and also polled for events in nsAppShell::ProcessNextNativeEvent.
Hi Milan, any idea who knows enough about our linux gfx plumbing to say what is needed here? If this is going to be a big chunk of work it would be good to know about it ahead of time!
Flags: needinfo?(milan)
Big chunk of work, but unclear it's the work that we would want to do anyway.  CC-ing Jeff, he wanted to have the conversation about "why is the sandbox meaningless if we have this connection", though we know it won't be as good a sandboxing with it.
Flags: needinfo?(milan)
(In reply to Milan Sreckovic [:milan] from comment #3)
> CC-ing Jeff, he wanted to have the conversation about "why is the
> sandbox meaningless if we have this connection", though we know it won't be
> as good a sandboxing with it.

A connection to the X server can be used to read from or write to any window/drawable, inject input events, and poll the input devices.  I can see the argument that this would still raise the bar for meaningful exploitation as compared to control of an unrestricted process, but at the same it seems wrong to describe something as “sandboxed” if it can be turned into a clandestine remote access tool and keylogger.
(In reply to Jed Davis [:jld] from comment #4)
> (In reply to Milan Sreckovic [:milan] from comment #3)
> > CC-ing Jeff, he wanted to have the conversation about "why is the
> > sandbox meaningless if we have this connection", though we know it won't be
> > as good a sandboxing with it.
> 
> A connection to the X server can be used to read from or write to any
> window/drawable, inject input events, and poll the input devices.  I can see
> the argument that this would still raise the bar for meaningful exploitation
> as compared to control of an unrestricted process, but at the same it seems
> wrong to describe something as “sandboxed” if it can be turned into a
> clandestine remote access tool and keylogger.

Switching from using XRender for rendering is a requirement for this bug. It is currently planned but not really scheduled.
(In reply to Jeff Muizelaar [:jrmuizel] from comment #5)
> (In reply to Jed Davis [:jld] from comment #4)
> > (In reply to Milan Sreckovic [:milan] from comment #3)
> > > CC-ing Jeff, he wanted to have the conversation about "why is the
> > > sandbox meaningless if we have this connection", though we know it won't be
> > > as good a sandboxing with it.
> > 
> > A connection to the X server can be used to read from or write to any
> > window/drawable, inject input events, and poll the input devices.  I can see
> > the argument that this would still raise the bar for meaningful exploitation
> > as compared to control of an unrestricted process, but at the same it seems
> > wrong to describe something as “sandboxed” if it can be turned into a
> > clandestine remote access tool and keylogger.
> 
> Switching from using XRender for rendering is a requirement for this bug. It
> is currently planned but not really scheduled.

Further, we have the problem of GLX which we don't really have any plan to stop using, nor is there much of a replacement for it beyond using Wayland.
(In reply to Jeff Muizelaar [:jrmuizel] from comment #6)
> Further, we have the problem of GLX which we don't really have any plan to
> stop using, nor is there much of a replacement for it beyond using Wayland.

For reference: Chromium on Linux also uses GLX, but the GL drivers and the X11 connection are in the GPU process, so a renderer process can't directly access them.
Noting for future reference: The MIT-SHM extension might also be a problem, because SysV shared memory follows Unix's “same uid policy” and can't be restricted/brokered like file access.  (It was observed when the initial attempt at a desktop content system call whitelist was made, but that was long enough ago that there could have been significant changes to how graphics work that might make this not a problem, so this should be double-checked.)  There's a not-well-specified revision to use memory-mapped files (http://patchwork.freedesktop.org/patch/15082/) but I don't know what would need to happen to make it work — Ubuntu 14.04 has a new enough X server and should (I think?) have new enough libraries, but X clients still empirically use SysV (including the Firefox parent process).
This should probably depend on bug 1180942
Whiteboard: sb+
Related work on exploiting a sandbox that allows X11 access: https://mjg59.dreamwidth.org/42320.html
Moving to sblc4 which deals with this issue.
Whiteboard: sb+ → sblc4
Component: Widget: Gtk → Security: Process Sandboxing
It would be worth looking into the Xorg sandboxing support (for servers compiled with --enable-xcsecurity) described here https://notehub.org/rp5n2.
(In reply to Haik Aftandilian [:haik] from comment #12)
> It would be worth looking into the Xorg sandboxing support (for servers
> compiled with --enable-xcsecurity) described here https://notehub.org/rp5n2.

I tested this, and as Jed already suspected, it kills performance because GLX and OpenGL no longer work:
GLXtest process failed (exited with status 1): GLX extension missing.

This has a fairly extreme effect on performance for me.

There are also some undesirable side effects: I don't seem to be able to copy/paste from the untrusted process.
See Also: → 1326361
Assignee: nobody → gpascutto
I tried using gdb to see what we're sending to the X server from the content process after sandbox startup.  If WebGL isn't used, the only thing I saw was, whenever a native widget is rendered in a new state, a GetInputFocus request sent by XSync() from this gdk_flush() call: https://searchfox.org/mozilla-central/rev/152c0296f8a10f81185ee88dfb4114ec3882b4c6/widget/gtk/nsNativeThemeGTK.cpp#1217

However, there doesn't appear to be anything that needs to be flushed/synchronized; that GetInputFocus message is the only thing in the buffer being sent to the server.  We seem to be doing the actual rendering to memory and handling it through our own IPC and compositor, which is good.  But it's not clear whether this is something we can depend on, and (if so) in what case that would hold; there's a lot of code that's being skipped due to conditionals like https://searchfox.org/mozilla-central/rev/152c0296f8a10f81185ee88dfb4114ec3882b4c6/gfx/2d/DrawTargetCairo.cpp#2329

It would be interesting, if WebGL is disabled or preffed off, to be able to just close the X11 connection or not even open it in the first place.  Failing that, if GLX is the only thing we need to deal with, then that could simplify the problem of proxying X11.

Another useful piece of information I've been told is that we don't ever use indirect GLX, because WebGL needs a newer version of OpenGL than that supports, so what we'd actually be sending over X11 is just for setting up out-of-band direct rendering, which could also make proxying a simpler problem.
There's also gfxPlatformGtk::GetPlatformCMSOutputProfile, which looks to be relatively easy to handle if needed (probably by sending down from the parent while setting up the content process): https://searchfox.org/mozilla-central/rev/5bb6dfcb3355a912c20383578bc435c1d4ecf575/gfx/thebes/gfxPlatformGtk.cpp#421

And this, which seems to be something about fonts and subpixel rendering: https://searchfox.org/mozilla-central/rev/5bb6dfcb3355a912c20383578bc435c1d4ecf575/gfx/thebes/gfxFcPlatformFontList.cpp#799
See Also: → 1376559
Priority: -- → P3
See Also: → 1503054

It seems like this is coming up again due to Fission. With the high number of content processes, we run the risk of encountering the 256 Xorg connection limit, which can cause crashes. :jesup has apparently run into this multiple times while dogfooding fission with high tab counts.

What remaining features are currently requiring that we create X connections in every content process? If it's only needed for WebGL, for example, it might be nice to delay creating the Xorg connection until a webpage requests it.

Flags: needinfo?(jmuizelaar)

Bug 788319 may help get rid of GLX, if needed.

WebGL can also potentially be remoted if need be. Does anyone know what's currently making X connections?

Flags: needinfo?(jmuizelaar) → needinfo?(jld)

WebGL is most of it and there's work in progress to remote that (see bug 1621762).

We're also initializing GTK in the content process, but we don't use it to draw native widgets directly; as I understand it they're drawn to shared memory and composited. There are still some unnecessary calls to XSync when that happens, but otherwise normal non-WebGL operation doesn't seem to actually talk to the X server. However, starting content processes in headless mode means no widgets are drawn (most noticeably scroll bars), and the last time I tried that, enabling :spohl's non-native widgets with widget.disable-native-theme-for-content didn't help. And I don't currently know enough about graphics to say what might be going on there or what it might take to fix it.

Flags: needinfo?(jld)
Fission Milestone: --- → ?

Perhaps we can just switch to calling gtk-parse-args instead of gtk_init() to avoid the display connection?

I tested MOZ_HEADLESS again, using the same patch as my previous attempt (just the existing headless mode, no gtk_parse_args), and it seems to work now — scrollbars are drawn (same as non-native + non-headless), and form controls seem to work. Headless content + native GTK widgets continues to not work, but that's expected. And WebGL is broken, but that's also expected.

I confirmed the absence of X connections with a patch to log child processes' sockets before sandbox startup. I also observed that we normally make two connections to the X server per content process (a finding also reported in bug 1635451): one from gtk_init called by ContentChild::Init, and the other from XOpenDisplay called by gfxPlatform::Init; both are held open for the life of the process.

I haven't tested extensively, but if there aren't other regressions, this seems useful for use cases that don't need (or actively don't want) WebGL, like Tor Browser.

So I did some bisecting, and headless non-native widgets appear to have been fixed in bug 1619664 (specifically the “Decide which theme to use per document” patch), then regressed in bug 1615026, then fixed again in bug 1620246.

The second broken range does look like a coordinate system problem: scroll bars are blank but occupy the expected size, and form controls are displaced and partly cut off by their bounding boxes.

With the original breakage, the absent scroll bars are the narrower GTK width… because it was ignoring the pref and using GTK in the headless case, until bug 1619664 reordered the conditionals. So non-native widgets have probably always worked in headless-content mode.

See Also: → 1636493

Tracking for Fission Nightly milestone M6c because this bug causes crashes with many tabs. This problem affects all Linux users, but Fission makes it worse because we'll have more processes.

The fix is probably a lot of work so we should start soon for Fission.

Severity: normal → S2
Fission Milestone: ? → M6c
See Also: → 1636555

(In reply to Chris Peterson [:cpeterson] from comment #23)

Tracking for Fission Nightly milestone M6c because this bug causes crashes with many tabs. This problem affects all Linux users, but Fission makes it worse because we'll have more processes.

The fix is probably a lot of work so we should start soon for Fission.

There are two problems here:

  • Remove X11 connections from content processes in all cases, for security, which is this bug
  • Avoid X11 connections in the common case, to avoid exhausting resource limits, which is bug 1635451

This bug has two main blockers at this point:

  • WebGL, which needs to be either remoted (bug 1607940) or changed use DRI directly of using X (bug 788319, if I understand it correctly)
  • Flash, which will be removed eventually but probably not before Fission is intended to ship

The first one of those has two potential solutions actively being worked on; I don't know how close either one is to being shippable. The second is more difficult; the NPAPI code has a number of dependencies on the content process being an X client (search nsPluginInstanceOwner.cpp for uses of Display* and Window for some examples; see also bug 1548475) and I have no idea how feasible it would be to change that.

If Fission Nightly is intended to ship before we remove support for Flash, then it will likely be necessary to figure out a smaller-scoped solution for bug 1635451.

(In reply to Jed Davis [:jld] ⟨⏰|UTC-6⟩ ⟦he/him⟧ from comment #24)

There are two problems here:

  • Remove X11 connections from content processes in all cases, for security, which is this bug
  • Avoid X11 connections in the common case, to avoid exhausting resource limits, which is bug 1635451

Sounds like this bug might not need to block Fission Nightly, but bug 1635451 (avoiding X11 in the common case) would be good to fix before Nightly. I'll move this bug to Fission milestone M7 (Beta) so we continue to track this work without blocking Fission Nightly.

  • Flash, which will be removed eventually but probably not before Fission is intended to ship
    ...
    If Fission Nightly is intended to ship before we remove support for Flash, then it will likely be necessary to figure out a smaller-scoped solution for bug 1635451.

We currently plan to remove Flash support in Firefox 84 (in Nightly in October, Release in December). If Fission is indeed ready to roll out to (some percentage of) Nightly users before October, we can disable Flash for those Fission users even before Flash EOL in Fx84. Flash is going away soon and we don't want any extra Flash work to delay Fission development or rollout.

Fission Milestone: M6c → M7

As :cpeterson mentioned, we explicitly have no intention of supporting Flash with Fission enabled, so it's OK to land code which breaks NPAPI when Fission is enabled.

I've been told there are plans to implement WebGL remoting in Firefox 80 (July) or 81 (August). Tracking for Fission Nightly M6c.

Fission Milestone: M7 → M6c
Depends on: 1642463

To help alleviate this bug, we only need to disable Flash for Fission Linux users (bug 1642463), not all users on all platforms (bug 1455897).

No longer depends on: remove-plugin-support
Depends on: linux-nnt
Depends on: remove-plugin-support
No longer depends on: 1642463
Blocks: fission

Pieces of this are slowly getting into place. If you enable all of these:

Then in theory you shouldn't have an X connection. So it's a matter of stabilizing/finishing and shipping these features.

I think we should remove all unused / legacy X11 code from the codebase as I think the only way forward for HW rendering is GL rendering. So we should support only SW fallback and GL. Karl, what do you think? Do you agree to remove the stuff like XRender, cairo XLib surfaces?

Flags: needinfo?(karlt)

(In reply to Martin Stránský [:stransky] from comment #30)

... Do you agree to remove the stuff like XRender, cairo XLib surfaces?

I personally can't agree more, but it would great to have some numbers about affected users - apparently there are still some out there, e.g. https://bugzilla.mozilla.org/show_bug.cgi?id=1280795#c6

https://www.scm.com/doc/Installation/Remote_GUI.html

OpenGL with X11 can run in two different modes: direct rendering and indirect rendering. The difference between them is that indirect rendering uses the GLX protocol to relay the OpenGL commands from the program to the hardware, which limits OpenGL capabilities to OpenGL 1.4.

My naive understanding:
When using X over the network (X11 forwarding), you can choose between two GLX modes: Direct Rendering on the remote computer you are accessing, or Indirect Rendering on your computer by transmitting GLX commands over the network. Indirect rendering supports OpenGL 1.4 at maximum, and Firefox anyway wants to move to EGL (bug 788319). WebRender requests GLX with OpenGL 3.2 or EGL with OpenGL 3.1, so in any case direct rendering (or software rendering) has to be used and the end result be uploaded over the network.
From reading on Bugzilla, XRender seems to help with scroll performance of such previously painted Basic compositor tiles(??). WebRender doesn't have such tiles by default and it seems the external compositor trait couldn't be implemented for X11 as you can't composite all surfaces "with a single atomic transaction" (bug 1617498 comment 0).

Regular remote desktop tools use video codecs, but X seems to transmit in raw or when using SSH compression (-C) still half the size of raw.(?)

https://stackoverflow.com/questions/12977879/ssh-compression-for-x11-forwarding

X11 graphics take up a lot of bandwidth. If your remote host is some distance away (i.e. not on the LAN), then you'll probably suffer sluggishness in your exported X11 applications.
Instead of interacting with the GUI using X11 forwarding, consider something else that has better optimization/compression, such as VNC or NX/FreeNX.
Actually well written X11 programs don't comsume a lot of bandwidth. The main issue with X11 is the abysmal badly written Xlib which performs a full synchronous roundtrip for almost every operation, while many operations could be performed asynchronously. If using another X11 protocol backend, like Xcb and not doing moronic things like rasterizing the whole GUI on the client and then just blitting the damn thing over the network, plain X11 over TCP over a low bandwidth, high latency connection is actually very usable. However many programs are written badly.

https://superuser.com/questions/1217280/why-is-x11-forwarding-so-inefficient

X11, by default, doesn't do any compression on the network data that gets passed between the application and the display server. As you mentioned, you can use the -C option on SSH to enable compression, and while this does help, it's not optimized for compressing graphical data. Compared to formats like H.264 which can obtain compression rates of 100 to 1, the -C compression will be lucky to achieve 2 to 1 compression. A better solution is to use a graphics or video optimized codec for the X11 video, but we have to be careful not to make it too lossy as desktops generally need to have sharper images than a movie so the user can still clearly read the text and make out fine details.

https://unix.stackexchange.com/questions/286691/more-efficient-x-forwarding

https://superuser.com/questions/1400952/x11-forwarding-and-efficiency

  1. Over time this changed, programs started doing the rendering by themselves, and many of those instructions became just "here's a bitmap which I already rendered, put this on screen" – very fast locally, but very slow over the network due to X11 lacking any image compression.

This means programs built using modern toolkits are much slower over networked X11, even if it's something as basic as antialiased fonts.

(In contrast, RDP has adapted over time and includes various forms of image compression such as JPEG and even H.264; you can often notice the compression artifacts while the full image is loading.)

  1. Fortunately, most UI toolkits such as GTK implement damage tracking so only updated regions are re-sent. However, a few programs (such as several Firefox/Thunderbird versions), don't support this and request a complete re-render of the entire window, even if it hasn't really updated.
    That's another difference between programs – well-behaving ones are still quite usable over the network, but buggy ones can saturate a 100 Mbps link doing absolutely nothing useful.

To 3: Partial present seems to work with Basic. Was it broken in February 2019 or just not enough in regard to scroll performance?
Anyway, if users choose X11 forwarding, they consciously or unconsciously prefer a technically bad solution over more performant and more secure ones (regular remote desktop solutions).

I think maintenance cost of Xrender as deprecated and technically inferior edge case is unjustifiable in the light of Linux user benefit, marketshare and Mozilla's aims. We regular Linux users don't profit from it at all. Those few commercial thin client users who want to keep absolutely best performance should invest into using regular remote desktop solutions that use video codecs and are more secure. If even RedHat is no longer interested in maintaining deprecated Xrender, then why should poor Mozilla bear the costs/risks?

I would agree with the observation that GL is preferred over XRender for HW rendering.

I assumed the XRender situation would already be poorly supported or at least inefficient with e10s, so I'm surprised that it is still used successfully.

I'm not able to weigh in on the VNC / X11-forwarding argument, but I doubt we could justify the maintenance burden of XRender just for X11 forwarding, particularly if VNC is a viable alternative.

We should keep some fallback for when neither GL nor XShm is available, which may mean keeping some basic xlib surface concept in the parent process.

Flags: needinfo?(karlt)

I think we should remove all unused / legacy X11 code from the codebase as I think the only way forward for HW rendering is GL rendering.

I don't understand the relevance to this bug? I think you want a separate bug about removing it for HW rendering, but that doesn't seem to be the only case for which Firefox currently uses the X connection.

(In reply to Gian-Carlo Pascutto [:gcp] from comment #35)

I think we should remove all unused / legacy X11 code from the codebase as I think the only way forward for HW rendering is GL rendering.

I don't understand the relevance to this bug? I think you want a separate bug about removing it for HW rendering, but that doesn't seem to be the only case for which Firefox currently uses the X connection.

The relevance is that when we're fine to draw Gtk widgets to image surfaces it may be possible to render native theming without X connection as we use GtkStyles for that and create the widgets from widget path. I'll look at it. Anyway, the connection from gtkPlatform was removed in Bug 1657315 so there's only one left (when GL is not used).

Depends on: 1657315

(In reply to Martin Stránský [:stransky] from comment #36)

The relevance is that when we're fine to draw Gtk widgets to image surfaces it may be possible to render native theming without X connection as we use GtkStyles for that and create the widgets from widget path. I'll look at it. Anyway, the connection from gtkPlatform was removed in Bug 1657315 so there's only one left (when GL is not used).

Hm, looks like we can't create GtkStyleContext without a display connection / screen so we can't use Gtk to render in headless mode.

Depends on: 1672013

Nika doesn't think removing 100% of X11 from the content process needs to block Fission MVP. Non-native theming will remove the biggest use of X11 in content processes and that's probably good enough for Fission MVP.

Fission Milestone: M6c → Future
Blocks: 1672013
No longer depends on: 1672013

Hey Jed,
Can you still reproduce this issue or should we close it?

Flags: needinfo?(jld)

Yes, this is still a live issue. It's blocked on bug 1638466, which needs more work to deal with GPU driver issues as I understand it.

(Note that bug 1635451 avoids keeping the X11 connection open if it's not needed, but access is still allowed.)

Flags: needinfo?(jld)
Whiteboard: sblc4 → sblc4, qa-not-actionable

Background: The X11 protocol has a very permissive security model;
clients have essentially full access to the windows of other clients,
and to global resources like input devices. Previously, our sandbox
policy for content processes needed to allow access to the X server;
this limited its effectiveness against a dedicated attacker.

This patch turns on the security.sandbox.content.headless pref added
in bug 1640345, which removes the sandbox policy rules that allowed
making new X11 connections, as well as opening the Xauthority file,
reading hardware info needed by Mesa, etc. It also runs content
processes in headless mode (whence the name) so they won't connect to a
display server at startup.

This also removes access to the Wayland compositor: the sandbox policy
never allowed that (as of when socket connections became default-deny),
but now content processes won't connect to it at startup. Wayland is
more capability-oriented so this is less significant for security, but at
a minimum it removes unnecessary attack surface.

Note that if the webgl.out-of-process pref is turned off, WebGL
will break unless security.sandbox.content.headless is also turned
off. (Similarly, widget.non-native-theme.enabled is needed to render
scrollbars and form controls in content.) As a result, this patch
adjusts the job definitions used by CI to test in-process WebGL so that
that they will continue to work.

Pushed by jedavis@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/1edd1e4056e6
Remove X11 access from the Linux content process sandbox. r=gcp,jgilbert
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 99 Branch
Regressions: 1758354

Release Note Request (optional, but appreciated)
[Why is this notable]: Important improvement in security for majority of Linux users (on older distros).
[Affects Firefox for Android]: No.
[Suggested wording]: The Linux sandbox was strengthened: processes exposed to web content no longer have access to the X Window system (X11).
[Links (documentation, blog post, etc)]: TBD - May blog about this depending on win32k lockdown rollout.

relnote-firefox: --- → ?
Regressions: 1803754
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: