Let me make sure I get this right: if you have two chains, one is mined by 49% hash power, the other is mined by 51% hash power, you think the 51% one won't be the longer one?
As I've explained in nearly every comment, one is mined by 100% hashpower, the other is mined by 51% hashpower. The 49% are still using the other blocks, too.
This is where your mistake is: they can't be using the other blocks as well, because they are erasing their own (uncensored) transactions by doing so.
Let's say that MiningPoolA controls 51% of hashpower, and MiningPoolB controls 49%. MiningPoolA is refusing to mine transactions from/to some wallet W.
Time T1:
MiningPoolA: OldChain -- Block1(no W)
MiningPoolB: OldChain -- Block1'(W receives 1BTC) still in progress
Any client will accept the chain with Block1 (no transactions from/to W)
Time T2 - if MiningPoolB tries to compete
MiningPoolA: OldChain -- Block1(no W) -- Block2 (no W)
MiningPoolB: OldChain -- Block1'(W receives 1BTC) -- Block2' (parent=Block1') still in progress
Any client will accept the chain with Block1 (no transactions from/to W), and MiningPoolB will never be able to catch up
Time T2 - if MiningPoolB decides to accept MiningPoolA's chain, but add the transaction in the second block:
MiningPoolA: OldChain -- Block1(no W) -- Block2 (no W)
MiningPoolB: OldChain -- Block1(no W) -- Block2' (parent=Block1, adds transaction W receives 1BTC) still in progress
Any client will accept MiningPoolA's chain (Block1 and Block2, no transactions from/to W), and MiningPoolB will never be able to catch up
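The scenarios above all rest on clients following the longest chain. Here is a minimal sketch of that rule, assuming a chain is just a list of block labels. The `best_chain` helper is made up for illustration; real Bitcoin nodes compare cumulative proof-of-work rather than block count.

```python
def best_chain(chains):
    """Return the longest chain; on a tie, keep the one seen first."""
    best = chains[0]
    for candidate in chains[1:]:
        # Strictly longer only, so a tie never displaces the earlier chain.
        if len(candidate) > len(best):
            best = candidate
    return best


# Time T2 with MiningPoolB competing: its Block2' is still in progress,
# so its published chain ends at Block1'.
pool_a = ["OldChain", "Block1", "Block2"]
pool_b = ["OldChain", "Block1'"]

print(best_chain([pool_b, pool_a])[-1])  # Block2
```

Under this rule the order in which a client hears about the chains only matters when they have the same length.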
If a mining pool had 51% or more of the hashpower, it would always be mining the longest chain, and no one would be able to compete with it and publish another block. That is in principle, at least; in practice, since mining is probabilistic, someone else will occasionally win the lottery and find a new block first.
It would depend on how close the attack is to 51/49%. If it's close, the 49% chain should be able to keep ahead, making the attack take a long time and thus be very expensive... but there would be constant block reorganisations.
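To put rough numbers on "how close", here is a sketch that treats the fork as a biased random walk. The assumptions are mine, not from the thread: each new block is found by the censoring pool with probability equal to its hash share, the race starts with the censoring pool `deficit` blocks behind (one block containing the unwanted transaction has just been mined), and clients only switch once the censoring branch is strictly longer. Both function names are made up for illustration.

```python
def orphan_probability(attacker_share: float, deficit: int = 1) -> float:
    """Chance that the censoring branch ever becomes strictly longer."""
    gap = deficit + 1  # erase the deficit, then lead by one block
    if attacker_share >= 0.5:
        return 1.0  # a majority (or an exact tie in share) gets there eventually
    return (attacker_share / (1.0 - attacker_share)) ** gap


def expected_race_length(attacker_share: float, deficit: int = 1) -> float:
    """Mean number of blocks mined (both branches combined) before the overtake."""
    if attacker_share <= 0.5:
        return float("inf")  # the overtake may never happen, so the mean is unbounded
    # Each block moves the gap by +1 or -1; the average drift is 2*share - 1.
    return (deficit + 1) / (2.0 * attacker_share - 1.0)


print(round(expected_race_length(0.51)))    # about 100 blocks at 51/49
print(round(expected_race_length(0.70)))    # about 5 blocks at 70/30
print(round(orphan_probability(0.30), 3))   # 0.184 for a 30% minority
```

Under these assumptions a 51% pool does eventually orphan each block it dislikes, but it takes on the order of a hundred blocks per attempt, with a deep reorganisation at the end of each race. At 70/30 the same race is over in a handful of blocks.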
The 51% chain is mined with 51% of the total mining power, while the 49% chain will contain 100% of the mining power (as the 49% miners are more than happy to build off the 51% chain, while the 51% miners will only build off the chain without the censored transactions).
They are correct in asserting that a 51% attack to censor transactions is largely just a nuisance to the users and that all transactions should eventually end up on the longest chain (the 49%).
Any block you are working on has a reference to the latest block in the network. So, every miner starts from scratch once a new block is added - the 49% pool doesn't have any advantage compared to the 51% pool.
Each miner (say minerA) takes a bunch of transactions and chooses a previous block B1 to build on, then starts hashing. If another miner, minerB, announces a new block B2 based on B1 (with different transactions) before minerA finds its own block, then minerA can throw away all the work it did and start from scratch on B3 based on B2. But minerB will probably already be working on its own B3, and has every chance to win again.
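A toy sketch of why the work doesn't carry over: the parent hash is part of what gets hashed, so a nonce found for one parent says nothing about another parent. Everything here is a simplification for illustration (the `Block` class, the string payload, and the "00" prefix are made up; real Bitcoin hashes an 80-byte header with double SHA-256 against a numeric target).

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    parent: str   # hash (here just a label) of the block being built on
    txs: tuple    # transaction strings chosen by the miner
    nonce: int    # value the miner varies while searching

    def digest(self) -> str:
        payload = f"{self.parent}|{','.join(self.txs)}|{self.nonce}".encode()
        return hashlib.sha256(payload).hexdigest()


def mine(parent: str, txs: tuple, prefix: str = "00", max_nonce: int = 1_000_000) -> Block:
    """Try nonces until the block's digest starts with the required prefix."""
    for nonce in range(max_nonce):
        block = Block(parent, txs, nonce)
        if block.digest().startswith(prefix):
            return block
    raise RuntimeError("no valid nonce found within max_nonce tries")


b2 = mine("B1", ("tx1", "tx2"))
# Re-parenting the same transactions and nonce onto another block changes the digest,
# so the search has to start over.
same_work_new_parent = Block("B2", b2.txs, b2.nonce)
print(b2.digest() != same_work_new_parent.digest())  # True
```

Since every attempt is an independent draw, restarting costs a miner nothing beyond the block it failed to find. That matches the point that neither pool carries an advantage into the next round.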
Except that minerA can't build off block B1, because B1 contains transactions that minerA refuses to accept. minerB just has to keep trying to find blocks to put the censored transactions into, while minerA is fighting against the rest of the miners to outpace them. This only really becomes possible at much higher thresholds, e.g. 70/30. Meanwhile, the community will see what is happening, and steps may be taken to either soft-fork or hard-fork the bad actors off the network.