Microsoft's Calibri font is at the center of a political scandal in Pakistan (engadget.com)
252 points by okket on July 12, 2017 | 111 comments



Microsoft's Calibri font was also part of a debate in Turkey: a CD, claimed to have been produced in 2003 and used as evidence (about a coup) in a Turkish court case, contained MS Doc files written in the Calibri font.

The case nevertheless went ahead (i.e., the file was used as evidence) and people were sentenced, but later all claims were dismissed, and the judges have either left the country or are in jail now.

Full details are in the following post by Dani Rodrik, a professor at Harvard Kennedy School:

http://rodrik.typepad.com/dani_rodriks_weblog/2012/10/did-mi...


Reminds me very much of the kerfuffle over the alleged George W. Bush Air National Guard memos that ultimately got Dan Rather canned.

https://en.m.wikipedia.org/wiki/Killian_documents_controvers...


And deservedly so. Years later, he still defended their use despite all the evidence to the contrary: http://www.hollywoodreporter.com/live-feed/dan-rather-stands...


Well I tried to find that evidence... got tired of chasing references, never did get to the slam-dunk evidence that it couldn't have been typed on a selectric, if it exists.


This was one of those (rather common) cases where no one could really get to slam-dunk evidence that it couldn't have been typed on a selectric (especially given they weren't original copies) but any reasonable interpretation of the evidence suggested that it was far more likely to have come out of Microsoft Word on default settings. (Per many forensic document experts.)

Other aspects of the documents looked sketchy as well upon further examination.

A lot of other news organizations were really pissed because there were some questionable aspects to GWB's National Guard Service but those were pretty much off the table for discussion when the apparently forged documents were discredited.


> creator Lucas De Groot, who seemed skeptical of the font's use before its public release. "Why would anyone use a completely unknown font for an official document in 2006?"

> Microsoft's website states that version 1.0 of the font was available to download separately as far back as 2005. And, according to font consultant Thomas Phinney, Calibri was also available as part of a Windows pre-release in 2004.

Back then many people were using beta versions of Windows Server 2003 (or other server products). The Server edition made a great desktop (compared to XP and Vista), but it was also very expensive. The prerelease version had a 90-day trial period, which could easily be reset.

MS Office format is basically a memory dump. So forged documents should leave much more evidence with timestamps, office patches versions, error handling etc...


It seems as if the bottom line is that someone could have gotten their hands on Calibri for Windows at an earlier date than 2007. However, given this case, it seems rather more likely that they used the default font on the released product at a later date. Like most evidence, you can come up with alternate theories, but usually the obvious conclusion is the right one.


The fact that it was indeed publicly available before 2007 kind of throws the whole thing out of the window, though.

Even if it seems obvious, how are you going to prove that they didn't actually download the font or have the special Windows edition?

Although it might turn public perception, it shouldn't, I believe, be considered judicially.


This is what standards of proof are for. "Beyond reasonable doubt," "preponderance of the evidence," etc. You use different standards for different types of trials. In one trial you may ask, "is it more likely that this is a forged document using the default font, or that the user went out of their way to download an obscure font before its public release and specifically chose to use it?" In another trial with a different standard you may ask, "is there any possible way that this document was not forged?" The answer to each question is different, which is why we declare the standard of proof before the trial begins.

Of course, I do not know the standard in use in this country's court system.

https://en.wikipedia.org/wiki/Legal_burden_of_proof


Well, in politics as opposed to a criminal trial there's no such thing as burdens of proof, just whether a scandal has legs.

Among American (and British, Canadian etc) lawyers who advocate quantifying the concept of "reasonable doubt,"[1] the consensus figure seems to be that proof beyond a reasonable doubt means greater than 95% certainty (or, stated negatively, a less than 5% level of uncertainty). Judge Weinstein of the Eastern District of New York[2] puts the 95% number in his jury instructions, for example.

If we assume the Bayesian prior that it's at least 19x more likely that Pakistani politicians at the highest level of influence are secretly corrupt and/or evading taxes, than it is that they're secretly tech nerds who eagerly download the latest Microsoft betas and try out all the new fonts, then we have met that 95% standard. If the accused were a Microsoft executive instead of an official of one of the world's most corrupt states,[3] the equation would be different.
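
To spell out the arithmetic behind the 19x figure, here is a minimal sketch. It assumes the two explanations are the only possibilities, and the function name is made up for illustration:

```python
from fractions import Fraction


def odds_to_probability(odds_for: Fraction) -> Fraction:
    """Convert odds in favour (19 means 19:1) into a probability."""
    return odds_for / (odds_for + 1)


# 19:1 odds that the document is forged rather than typed by a font enthusiast.
p_forgery = odds_to_probability(Fraction(19))
print(p_forgery, float(p_forgery))  # 19/20 0.95
```

Odds of exactly 19:1 land on the 95% threshold, so anything "at least 19x more likely" clears it.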

Of course, there might be other arguments for doubt besides "the font was available in obscure beta software." One potential argument is that it's a newer version of an older document that was typeset in a different font, which was changed to Calibri at some later point (recent MS Words might default to displaying the document in Calibri if it were sent to a computer that didn't have the original font installed, or the file type doesn't include font information in a way Word supports, or it was copied and pasted from the clipboard).

[1] Which IMO it should be; most of the clear miscarriages of justice in either direction that I know of seem like they could have been prevented or at least been mistrials if, as much as possible, we'd nudged the jury to carefully quantify the strength of both sides' evidence instead of voting emotionally.

[2] https://en.wikipedia.org/wiki/Jack_B._Weinstein

[3] https://en.wikipedia.org/wiki/Corruption_Perceptions_Index


I begin to realize the extent to which mere eccentricity (say in choosing fonts) is already deeply illegal (once you factor in how many decisions we all make that put us in a 5% box.) No wonder people herd.


If this were the U.S., I'm pretty sure the use of Calibri as evidence of forgery would fail to be beyond reasonable doubt.

This is not the U.S., though, and I ain't a lawyer anyway.


I expect it would depend on what other evidence there were. I tend to agree that taken by itself it's a pretty thin reed to get to beyond reasonable doubt as would probably be needed in a criminal trial in the US. However, it could certainly help a case.

It's not unusual to have trials where there's no unassailable smoking gun but rather a number of facts that strain credulity to explain away.


> So forged documents should leave much more evidence with timestamps, office patches versions, error handling etc

It looks like we may only have seen scanned copies of the documents, so any Office metadata would be missing.

I wonder if there is any other subtle metadata. Perhaps default tab stops or borders?


printer steganography?


Unfortunately I don't think that would have much bearing on the origin of the content pre-printing.


Except printer dots are a printer specific serial number. If it was printed on a recent printer then that might be concrete proof of forgery (i.e. if the supposed original document was submitted as evidence as having been contemporaneous of the document's date but was shown to have been printed much more recently because of printer dots).

If you see what I mean...

And I'm not really commenting on the current case (not familiar with it at all)... just positing a possible way to prove a document's date as inaccurate.


Isn't the file format a zipped XML nowadays?

It'd be hilarious if they said "Here's my 2006 tax report, but you need Office 2013 to open it."


It is, but many of the XML values and attributes are just dumps of bitfields and identifiers that are structured exactly as they are in memory. In other words, the XML format was mostly a PR move to stave off accusations of not being "open" enough, and the people who implement support for the new format still have to reverse-engineer a bunch of Office-specific binary nonsense.
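
For anyone who wants to poke at the container itself, here is a rough sketch of listing the fonts named in a .docx using only the Python standard library. The function name and the in-memory sample file are made up for illustration; the sample holds only the main document part, so it is a stand-in rather than a valid Word file:

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# WordprocessingML namespace used by word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def fonts_in_docx(docx_file) -> set[str]:
    """Return font names set explicitly on runs in word/document.xml."""
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    fonts = set()
    for r_fonts in root.iter(f"{{{W_NS}}}rFonts"):
        name = r_fonts.get(f"{{{W_NS}}}ascii")
        if name:
            fonts.add(name)
    return fonts


# Hypothetical stand-in for a real file: a zip with just the main document part.
SAMPLE_XML = (
    f'<w:document xmlns:w="{W_NS}"><w:body><w:p><w:r>'
    '<w:rPr><w:rFonts w:ascii="Calibri" w:hAnsi="Calibri"/></w:rPr>'
    "<w:t>Hello</w:t></w:r></w:p></w:body></w:document>"
)
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as archive:
    archive.writestr("word/document.xml", SAMPLE_XML)
sample.seek(0)

print(fonts_in_docx(sample))  # {'Calibri'}
```

This only catches fonts set directly on a run. A real document usually inherits its default font from the styles or theme parts, which this sketch does not read.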


Office97 was binary format, XML comed much latter as response to OpenOffice


Open XML formats (.docx, etc) came with Office 2003.

From what I remember, the EU wanted to standardize an open document format, and had their eye set on OO ODF, as the most mature of those at the time. Microsoft put out their OXML format, gave out free patent grants, lobbied hard, and managed to get it into ISO.

If you want some nightmares though, open an office XML file in a text editor some day.


> Open XML formats (.docx, etc) came with Office 2003.

They came with Office 2007. For Office 2003, there was a plugin that allowed opening the XML files, in a limited way.


No, there was a separate XML format for 2003 before OOXML, called WordprocessingML. Basically nothing else ever used it, but later MS Office and LibreOffice still read it.


I wrote a tool to convert pptx files to markdown. Using python it was maybe a 3 hour project. I'm sure some of the more nuanced formatting is a challenge, but the format really isn't bad and keeps the content quite manageable.


You're not going to leave us hanging, are you?



That is in fact what I was referring to, although I don't think it should still be up.

I wrote that and on a whim thought, "I'll make this a pip package," severely underestimating that task. The tool works but I never finished that packaging. The working source should be on github if you're interested.


> If you want some nightmares though

It is not really all that bad. The hardest part of pulling it all out and reassembling it into what you want is the shared strings aspect.

It is really quite distinct from the memory dump of pre-xml doc/excel files.
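
A minimal sketch of the shared strings lookup, assuming the usual spreadsheet layout where a cell marked t="s" stores an index into the shared strings part instead of its text. The XML snippets and function names are made up for illustration; a real workbook keeps these parts inside the zip under xl/:

```python
import xml.etree.ElementTree as ET

S_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
NS = {"s": S_NS}

# Hypothetical, trimmed-down stand-ins for the shared strings and sheet parts.
SHARED_STRINGS_XML = (
    f'<sst xmlns="{S_NS}"><si><t>Name</t></si><si><t>Calibri</t></si></sst>'
)
SHEET_XML = (
    f'<worksheet xmlns="{S_NS}"><sheetData><row r="1">'
    '<c r="A1" t="s"><v>0</v></c>'
    '<c r="B1" t="s"><v>1</v></c>'
    '<c r="C1"><v>2007</v></c>'
    "</row></sheetData></worksheet>"
)


def load_shared_strings(xml_text: str) -> list[str]:
    """Return the shared strings in index order, joining multi-run entries."""
    root = ET.fromstring(xml_text)
    return [
        "".join(t.text or "" for t in si.iter(f"{{{S_NS}}}t"))
        for si in root.findall("s:si", NS)
    ]


def cell_values(sheet_xml: str, shared: list[str]) -> dict[str, str]:
    """Map cell references to text, resolving shared-string indexes."""
    root = ET.fromstring(sheet_xml)
    values = {}
    for cell in root.iter(f"{{{S_NS}}}c"):
        v = cell.find("s:v", NS)
        if v is None:
            continue
        raw = v.text or ""
        # t="s" means the stored value is an index, not the text itself.
        values[cell.get("r")] = shared[int(raw)] if cell.get("t") == "s" else raw
    return values


print(cell_values(SHEET_XML, load_shared_strings(SHARED_STRINGS_XML)))
# {'A1': 'Name', 'B1': 'Calibri', 'C1': '2007'}
```

Numbers come back as raw strings here; dates, formulas, and styles are left out on purpose.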


For real. On Palladium/NGSCB there was a strong effort to maintain correspondence between documentation and code, to the point that there was an effort to have header files be generated directly from the specs, which were Word docs. The biggest practical challenge was that the extractor had to instantiate Word just to read the text content of paragraphs with the specified style. This is not something that you want in your build pipeline if you can avoid it.


Ewwww.

I woulda cobbled something together to suck the text out of them.


Hahaha, me too, but the old Word format really is as bad as everyone says, and we were trying to build a system with formal correspondence from spec to code (and sometimes from spec to proof to code), so having a "cobbled" together something really didn't fit the model.

The real problem was using word as our documentation format, but at Microsoft in the early oughts there really weren't many alternatives.


Well, I do believe LaTeX was available.

The cobble-together part, to be successful, would pull the text out reliably. And the format is readily documented, and Open/Libre office processes it as well. The code to do the extract might be ugly, but so long as it reliably produced the text in a CI/CD environment, that would be OK.


The constraints were entirely organizational; many better technical solutions were suggested.


Comed?


The deduction that some document must be a forgery because it's dated to 2006 and the font it is written in wasn't available until 2007, sounds like something out of a Sherlock Holmes story.


Sounds reasonably quick to observe. Lawyers work with documents every day.

In the basic training I've had in spotting fake documentation, "check the font seems right" is near the top of the list, e.g. with regard to fake company letterheads.


I believe you. I just thought that some quick deduction like that would be well placed in the Sherlock universe.


Oh yes, absolutely. Would fit in perfectly. Pure deduction ;)


This is a reasonably standard approach in forensic examinations.


This is not the first time folks have gotten in trouble because of PC fonts. A high-profile case is that of Dan Rather, who had to resign from CBS News after the Killian documents controversy: https://en.wikipedia.org/wiki/Killian_documents_controversy


Would be interesting if the document also has the "tracking dots" printer companies use.


All word processors should change their default font at every major version so we can have #fontgates more often.


Apple is already doing so. :]


Nawaz Sharif and family are clearly lying; people (especially deed-writing government departments) wait a few years after a stable release. The font shipped in 2007, and I'd bet nobody installed Vista right away.


They should have stuck with Comic Sans.


For sensitive documents, nothing beats Wingdings.


That's cyberpunk with vaporwave flavor again.


Best comment ever posted on this site hands down. Not even being sarcastic. Stay cyber, stay punk.


For those who are interested, this is the text of the full report:

https://drive.google.com/file/d/0B6leBB47NfItbHQwa3c4d3E0QmM...

(for those wondering why it is in English, that is one of the official languages in Pakistan).


Pakistan holds promise as a country and an emerging economy.

Their biggest problem: the country fails to attract foreign capital, and the local nouveaux riches never keep their money in the country.

To many it is a mystery why a country that holds an incomparably less protectionist stance than its neighbors, has retained much of the British legal system, and is more welcoming to foreign capital policy-wise fails so hard at that.

One explanation that I believe explains more than just saying "it is because of crazy mountain people surrounding the country" is that the perpetual uncertainty over the succession of power forces rich people to move their capital abroad. That affects the trade balance and creates an illusion of very low capital appreciation in the country, which in turn tells foreign investors that the country is a lost cause for conventional investments.


Are you serious? The biggest problem of Pakistan is failure to attract foreign capital? What about harboring terrorist groups like LeT, JeM, and AlQ? Or the fact that religious strife is killing innocents - Sunnis killing Shias, Muslims killing Hindus. Or that the army is not under the control of the civilian government, leading to multiple coups. I can provide hundreds of references to back these claims. Lack of foreign capital is probably the last problem for Pakistan.


>Are you serious

I'm totally serious, Virtuabhi.

In total, the number of lives lost to the fighting in the north and to internal opposition movements comes to around 4300 people over 10 years - that figure is from Hina Rabbani Khar. This is not much for such a populous country.

Armed struggle rarely reaches the big cities, where the majority of business activity takes place. In my observation, something big happens in the big cities only once a year or less.

I do not believe that OBL, AQ and co. count for more than a small nuisance.

No armed group has ever come close to toppling the central government, nor will one in the foreseeable future.

India is a genuine issue for the country, and not solely because of its military ambitions. The constant economic and diplomatic pressure from India is the reason for much of the country's international isolation.

The army, and the formerly regular army coups, could have been seen as an issue in the past. In my own observation, the army's freedom of action was already on a downward trajectory during Musharraf's reign, and that was partially responsible for him falling out of power so fast (the other reason being that the Zia-era generals simply died of old age). Civil society (the Pakistani oligarchy) is in an antagonistic relationship with the army. They are squeezing the generals out of power, preventing them from gaining any momentum. The generals are not coming back to power in our generation. Big money has won over the military.


Is it totally unrealistic to expect better relations between India and Pakistan in the future?


As a Pakistani that sees the bigger picture, I agree with you. It's a popular narrative in Pakistan that a lot of the country's problems are caused by RAW, the CIA, or any other three lettered acronym, and that a lot of our problems could simply be solved with foreign money.


I specifically remember using Calibri quite a while before it became the default in MS Office. It looked nice, and it was different from all the Arial and Times New Roman out there.


Wasn't there an article saying that some printers print small marks on the document that provide some info on when it was printed, the model, etc.? I suppose the printers of that time may not have had the capability yet.

Found the HN discussion[1] related.

[1] https://news.ycombinator.com/item?id=14501894


Color laser printers. Most places using laser printers, especially for printing documents, would usually use B&W printers.


While I believe there was probably fraud involved, they could have been using the volume licensed version released in 2006 or the beta (2005) of MS Office.

Piracy of both is rampant in Asia.


The document in question concerns property bought in London, and all the documents were prepared there.


> The report charges the PM with perjury, hiding his wealth, forging documents, and living beyond his means.

You can be charged for living beyond your means?


I don't know the details of Pakistani law, but it seems like it could be a clever judicial technique for fighting corruption or tax evasion in cases where it cannot be directly proven. If you spend more money than you lawfully earn, then presumably the rest has to come from illegal activities (assuming no fortune)?


Or are in debt?


The debt should also be verifiably documented. Otherwise one can claim any money as a loan from friends.


In Brazil you can. The idea is to catch corruption.

For example, we had a governor whose salary was about 10,000 USD a month... yet he was spending millions.

Not even the craziest bank would fund that, so it must have been corruption. (It was; he received, if I remember correctly, about 100 million USD in bribes.)


In India, you can. Having assets disproportionate to known sources of income creates a presumption of corruption, plus one can also be charged with tax evasion.

http://www.thehindubusinessline.com/opinion/disproportionate...


I guess the government will soon be... Sans-Sharif

(sorry)


Very well done, original humor always welcome!


Nice one


Reddit one-liners are why I hate HN threads lately


I can understand your frustration but a downvote would have sufficed. If you hadn't seen the HN guidelines, take a gander when you get a spare moment. https://news.ycombinator.com/newsguidelines.html

Most notably this:

> If your account is less than a year old, please don't submit comments saying that HN is turning into Reddit. It's a common semi-noob illusion, as old as the hills.


My account is several years old. HN is turning into Reddit. These one-liners really are a lot more common than they were even 2-3 years ago.


As right as you may perceive yourself to be, complaints are less substantive than the original comment is.


If you really believe that, then your own comment is even less substantive than mine.

The meta-discussion has some substance to it IMO; it conveys at least some information and might suggest some actions to take. The original comment has literally none.

(Meta-discussion is marginally poisonous to productive discourse, and on an ideal HN this whole thread would be downvoted, but one-liners like the original post are a much bigger threat)


> but one-liners like the original post are a much bigger threat

If your ideals can be so easily threatened then perhaps one-liners are not the problem.


Above-average discourse is fragile almost by definition - mean-reversion is a real thing.


I agree. If I wanted r/technology, I'd go to r/technology. Please, fellow users of HN, downvote jokes.


I never have liked this "HN is serious business" mentality that treats minor humor as a negative to be excised. It makes the community seem very dour.


I like dour. That mentality is one of the things that makes HN different from other communities, and one of the reasons I like HN more than those other communities; I'd rather HN retained this distinctiveness than reverting to the mean.


I always thought it was the smart people, and interesting posts, rather than a lack of anything resembling a smile.


I don't think they're so easy to disentangle.


You sound like you wouldn't be much fun to hang out with at parties.


Personal attacks aren't ok here.


I'm sorry, you are being downvoted for your reference to "parties". Please refer to the HN Discussion Code of Conduct, section 4 subsection 3(c.R) which states that Real Hackers do not participate in, discuss, or even acknowledge social gatherings known as "parties" except when discussing serious scientific reports centered around research into these social phenomena. This is your only warning.


I've been on reddit for over 5 years and have only been on HN for about 8 months.

HN is definitely completely different from reddit. I like the occasional joke or pun, but reddit takes it too far. The same tired jokes are repeated over and over and yet still get upvoted.


I'm glad we're talking about fonts because I have a question that I want to ask this audience-

How many of you insist on choosing your own fonts in your websites, rather than just specifying families? I can understand if the fonts you choose are fancy. But otherwise, as a power user who has carefully set default fonts for various families, I dislike having to read stuff in any font except my own carefully chosen ones.

I wish websites deferred to users somehow on this.


I think things like Google Fonts have led to an overuse of stylistic fonts and that should be addressed. But not for the reasons you're stating. 99.9% of all users don't care about setting their own fonts for websites or even know that that's possible. I'd rather let the company/website that paid for a designer to come up with a solid typography dictate how the site should look than trust my programmer's eye for design (which is to say I'm blind as a bat in that regard).

The real reason overuse of fonts is dangerous is that it's a huge hit to site performance if not done properly. I've seen way too many sites that are yanking in Google Fonts all willy-nilly with zero regard for typography or performance. That's bad, but it's almost always the result of an inexperienced "nephew that knows HTML" or someone with access to a really shitty WordPress template.

And since the vast majority of people have zero interest in tweaking fonts themselves (myself included), why not use fonts other than the handful of (imo) boring system fonts?


> why not use fonts other than the handful of (imo) boring system fonts?

Ayup. I use Oxygen on pretty much everything because I really like how it looks and it's part of the distinctive style I impart to the stuff that I'm building. The number of users who have really strong opinions about fonts versus the number of users who either consciously or unconsciously appreciate a good-looking design is not a ratio in favor of caring about the former.

(Also agreed on performance; that's why I use one font family and use it from Google Fonts--because it caches.)


I'm obsessed with speed on my main site (whatismybrowser.com) so I simply use default system fonts, in order to minimise HTTP requests and download time.

It's quicker to just use system fonts than have to have the user make a trip to Google's Font Servers.


If you have such strong feelings about it, you can set your browser to only load fonts you like.


Why do you allow websites to set their fonts? Dark Reader chrome extension


Nice trick, Fx has that built in too!


Pakistani political establishment is hilarious


I guess today 2017 "font" now equals "typeface."


You're decades late with this observation. It happened when the first GUI word processors labeled the setting as "font". Users are not typography experts and will call it as the software does.


> I guess today 2017 "font" now equals "typeface."

As a non-native English speaker I really don't understand the subtle difference between these words.


https://www.fastcodesign.com/3028971/whats-the-difference-be...

> Even among type professionals, there’s a growing acceptance that for most people, the terms font and typeface can be used interchangeably. Only experts really need to worry about it.

> “For most people in most situations, those terms can swap around without any trouble,” Tobias Frere-Jones tells Co.Design. “The distinction would matter in type design, obviously, but also contexts which involve engineering, like app development or web design.”

I'd agree with that. To save you reading the entire article, here's the rough definition presented in it:

Typeface refers to the set of all the distinct shapes. It's what most people mean when they say "font" in everyday life.

Font refers to a specific weight at a specific size.

Even as a web developer I find that distinction confusing because a single "font" file will usually support any number of sizes and can be rendered in different weights or even as oblique, although of course the result will be less sophisticated than using the "hand-crafted" weight variants or italics. And of course there are file formats that can actually contain more than one such "font" (even from different typefaces).

So outside of actual type design or technical specifications I would say that the terms should be treated as interchangeable because nobody can be bothered using them correctly.

I'd actually argue even type designers don't care much what you call them. Or at least that seems to go for the kind of type designers that actually design types rather than correcting people on the Internet.


> Typeface refers to the set of all the distinct shapes. It's what most people mean when they say "font" in everyday life.

> Font refers to a specific weight at a specific size.

That's a much better summary than the article's "In brief:" sentence.


UK: a font was originally a fount of type (from fountain).

http://britishletterpress.co.uk/type-and-typography/type-syn...

Above dates from the days when letterpress was still around (my local letterpress based jobbing printer closed his shop in the late 80s/early 90s because of repeated break-ins after his lead). Article tends to suggest a fount would be a given weight/style of a typeface.


Typeface is the design, font is the implementation. In movable-type printing, a font is a set of metal slugs, each bearing a character from the typeface. In digital typography, a font is typically a TTF or OTF file.


Nobody except Donald Knuth and recreational pedants bothers to make that distinction.


In British English, "font" is the thing you dunk a baby in at church, while "fount" is a set of type of one particular face and size.


Roughly speaking, typeface is the design and font is the digital file. The distinction dates back to the time when fonts were made of lead in a wooden case and is ill suited to our digital world. A modern approximation of typeface is "font-family". That being said, except for typography nerds nobody gives a s* about that, and unless you're on a design forum the two are interchangeable.


"Font" is even used by many editors, so I'm not surprised this is how people refer to it. It's also a much simpler word to say, so I understand it. They don't refer to the same thing, however. A font originally meant a variation of a typeface, such as italics. But the German Wikipedia already calls it a typeface and so does the English one in the informatics context.


I'd say this has been the case for a minimum of the past 20 years, at least in the US.

Here's a screenshot from an early mac control panel. Notice the use of the word "Font" for typeface.

http://www.amd.e-technik.uni-rostock.de/ma/rs/lv/hoqt/fog_fi...


Original Mac typefaces were not scalable so you were actually selecting a font in the control panel.


I think "font" is the right term here. They talk about it being "available to download" and "available as part of a Windows pre-release", all of which make more sense when talking about the font (calibri.ttf) rather than the typeface, inasmuch as one can meaningfully distinguish the two: I'm not aware of any other implementations of Calibri (e.g. OTF), but I'm happy to be corrected.


Ah, but .ttf stands for True TypeFace.

Doesn't it?


TrueType Font, I believe :)


It was a play on words, a comment on the ongoing confusion over "font" vs. "typeface". (Having worked on font-related software at Adobe, I'm fairly familiar with this.)

I did think it was an interesting coincidence that .ttf could be misinterpreted as True TypeFace.

Guess I should have thrown a /s in there...


TrueType Font, actually.




