It is pretty hard to implement a container with all the precise invariants and guarantees that the Standard requires.
More to the point, your implementation might still not be as fast as the standard library's, because the standard library can make assumptions about the compiler that you cannot make in portable code: what is UB to you might be well-defined behavior to stdlib authors. For example, they might be able to use memcpy for containers of stdlib types that they know are safe to handle in that manner.
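To make the memcpy point concrete, here is a rough sketch of what portable code is limited to. The helper name `relocate` is made up for illustration and is not taken from any real standard library implementation; it assumes C++17. Portable code may only take the memcpy path when the type trait says it is safe, whereas a stdlib author who knows the internals of their own types could, in principle, take the fast path for more of them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <new>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical helper: moves n objects from src into uninitialized storage
// at dst, as a vector would when it grows. Returns true if the memcpy fast
// path was used. Not exception-safe; a sketch only.
template <typename T>
bool relocate(T* dst, T* src, std::size_t n) {
    assert(n == 0 || (dst != nullptr && src != nullptr));
    if constexpr (std::is_trivially_copyable_v<T>) {
        // The only case where portable code may copy raw bytes.
        if (n != 0) {
            std::memcpy(static_cast<void*>(dst),
                        static_cast<const void*>(src), n * sizeof(T));
        }
        return true;
    } else {
        // Everything else: move-construct each element, then destroy the source.
        for (std::size_t i = 0; i < n; ++i) {
            ::new (static_cast<void*>(dst + i)) T(std::move(src[i]));
            src[i].~T();
        }
        return false;
    }
}
```

With this, `int` takes the memcpy path, while `std::string` falls back to the element-wise loop, because portable code has no way of knowing whether a byte-wise copy of a given implementation's string is safe.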
Meanwhile, do you believe it's hard to implement a container?
And no, adding cruft to the STL is not a one-way street. See for example the history of C++'s smart pointers: std::auto_ptr was deprecated in C++11 and removed in C++17.