The fallacy of absolute numbers

I often come across the following type of argument in research papers:

  • You could save 3 bits of storage for every value in your database. Surely that’s irrelevant. Nobody cares about saving 3 bits!
  • You can sort arrays in 10 ms. Surely, that cannot be improved upon? You are already down to 10 ms and nobody cares about such small delays.

I hope you can see what is wrong with these statements.

I call it the fallacy of absolute numbers: you express a measure or a gain as an absolute value, and then conclude that it is optimal or nearly optimal because the number appears small (or large).

Remember: Saving 3 bits of storage out of 6 bits is a 2:1 compression ratio. Sorting in 5 ms instead of 10 ms doubles the speed.
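To make the point concrete, here is a tiny Python sketch that restates those absolute numbers as ratios (the 6-bit value size and the 10 ms baseline are the hypothetical numbers from the examples above):

```python
# The same gains expressed in relative terms rather than absolute ones.
bits_before, bits_saved = 6, 3
compression_ratio = bits_before / (bits_before - bits_saved)
print(compression_ratio)  # 2.0: a 2:1 compression ratio, not "just 3 bits"

time_before_ms, time_after_ms = 10, 5
speedup = time_before_ms / time_after_ms
print(speedup)  # 2.0: twice as fast, not "just 5 ms saved"
```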

Note: I am sure that someone else has documented this fallacy, but I could not find any reference to it.

Lack of steady trajectories and failure

A common piece of advice given to young researchers is to find a niche. (See Michael's Branding Your Research.) That is certainly good advice. Instead of being just another young researcher, you can be the new person working on topic X. But it seems to happen no matter what: most Ph.D. theses address a narrow topic. I believe the real advice people would like to give is: find yourself a nice topic, and make sure this topic becomes fashionable. Of course, this implies that you can somehow predict the future, or that you have a thesis supervisor with enough clout to either initiate new trends or gain inside knowledge of upcoming trends.

A more interesting question is what you should do with the rest of your career, assuming you somehow landed a research job. Should you find yourself one or two niche topics and stay there for the rest of your life? That is a common strategy. You save precious time: instead of having to skim 100 research articles a year, you may get by with 20 or 30, or even fewer. Moreover, because you are the leading authority on one or two topics, you can never be caught unaware. You never have to worry about finding new topics: you just keep iteratively improving whatever you are doing right now. With some luck, you can reuse your funding proposals year after year. Finally, you can quickly get to know everyone who matters regarding these narrow topics. And that is a perfectly good strategy.

The problems begin when we associate the lack of a steady trajectory with failure. Encouraging static research topics leads to conservatism. Meanwhile, some of the most innovative researchers have cultivated varied interests. Von Neumann was a set theorist, but he wrote 20 papers in Physics, and even in Mathematics, he covered a wide range of topics (set theory, logic, topological groups, measure theory, ergodic theory, operator theory, and continuous geometry). Would we have been better off had von Neumann remained a pure set theorist?

And I tend to have more trust in researchers who have their eggs in different baskets. They can afford to be a bit more critical.

Warning: I am not urging Ph.D. students to change topic repeatedly while writing up their thesis. Finish whatever you start. And be aware that approaching a new research topic can be costly.

Academic publishing is archaic

Technological progress tends to increase the available information. Thus, our capacity to manage this information becomes overloaded (hence the term information overload). As Clay Shirky explained, it is not so much an information overload as a filter failure. The abundance of information is never the problem. The real problem is the lack of efficient strategies to index, summarize, filter, cross-reference and archive information.

But information overload is nothing new. In Reading Strategies for Coping With Information Overload ca. 1550-1700, Blair surveys the techniques our ancestors invented to cope with the abundance of books:

  • the alphabetical index;
  • the reference book;
  • copy and paste (with actual scissors) to save time in note-taking.

What I find fascinating is the historical perspective: while still useful, the alphabetical index is hardly exciting anymore. It has been supplanted by full text search (in e-books). There are still reference books (such as dictionaries), but they are being replaced with online tools. Information overload continues to generate many inventions: the search engine (such as Google), the recommender system (as on Amazon.com), and the social network (such as Twitter). In a real sense, these tools expand our minds. We become smarter.

Yet every time I finish writing a research article, I am amazed at how old fashioned the format is.

  • Research journals still ask for silly metadata such as keywords, even though most researchers rely on full text search.
  • The format is clearly meant for paper, even though most of my collaborators browse research articles on their computers.
  • We have silly things like page limitations enticing people to pack more words per page by reducing spacing.
  • It is excessively difficult to correct or improve a “published” article.

There is hope. The PLoS One journal presents research articles in an innovative format. The article is interactive: anyone can rate and comment on it. Many journals allow the authors to upload supplementary material. Yet I predict that in 20 years, we will look back and think that academic publishing in 2010 was archaic. (I admit that it is not a daring prediction.) There is much room for innovation.

Source: Erik Duval.

Maximizing your impact as a researcher (guest post)

The greatest challenge for a researcher is to choose projects that have a good chance of delivering impact. Alain Désilets from NRC—co-author of VoiceGrip, Webitext and the Cross Lingual Wiki Engine—shared his strategies with me:

  • Look at how many workdays per week you can dedicate to research, and let that be the number of projects you work on in parallel. In other words, if you are one of the lucky few who can dedicate 5 days per week to research, then you have room for 5 projects.
  • Invest your energy proportionally to the amount of positive feedback you receive for each project. This includes collaboration offers, grants, potential users, and so on.
  • Never work alone on a project for too long. It’s OK to start exploring a compelling idea on your own for a couple of months, but if you can’t convince someone else to work with you on it, maybe it’s not such a great idea after all. Maybe it’s technically infeasible, maybe there is no need or market for it, or maybe it’s just too far ahead of its time. Don’t completely give up on the idea yet. Put it on ice for now and keep sharing it with people until you meet the right people to make it happen with you.
  • Instead of looking for partnership money, which will require you to spend months drafting and revising agreements (who wants to deal with lawyers anyway?), look for talented people who have control over their own time and are willing to invest some of that precious resource working with you on an idea. Don’t worry about who will own the baby before it’s actually born (that usually ensures that the work relationship never gets off the ground). Just make sure everyone keeps a lab book documenting who did what, so that you will have a basis to argue in a friendly and civilised manner about who owns what share of the baby, if you ever have that nice problem.
  • Talk to lots of different people from different walks of life about your idea. You never know who will give you the insight or contact you need to advance to the next level on a given project. Of course if you do this, you pretty much give up on the idea of patenting your idea.
  • Make sure you collocate in time and space as much as you can with your collaborators. There was a time when I had 5 projects (those were the happy days of 5 days of research per week), and I had scheduled things so that on Mondays I would work with Joe on project X, Tuesdays were dedicated to working on project Y with Jane, and so on.
  • Find an organisation or a type of end user with an interesting problem that you think you could solve using some bleeding-edge technology. Become very intimate with the problem, maybe even doing these people’s job for a day. Once you understand their problem well, don’t jump right away to the high-tech solution. Instead, start with the Simplest Thing That Could Possibly Work, and only add complex technology where and when it is needed. This may not get you a publication in a first-tier journal, but it greatly increases your odds of developing a system that will actually be used. Plus, when you DO find that you need sophisticated technology, you know exactly why, and what the actual value added is.
  • Use Agile Development practices, which allow you to advance your projects in short, highly focused bursts of a few days (1-day bursts are even possible). Write lots of short “stories” that describe things you can accomplish in a day or less, and keep re-prioritizing them so that the ones that currently add the most value to your target users are always at the top. Use Test-Driven Development to ensure that your system is always stable and that you can put it aside for a few days or months, yet pick up right where you left off. These kinds of techniques are essential if you want to be able to quickly reallocate your effort depending on how hot your different projects are.

How do we choose research journals?

The publishing house Elsevier invited me to fill out a survey regarding their journals. As a reward, they gave me a glimpse at their statistics.

The three most important considerations when choosing a research journal are (in order):

  1. Speed of review process
  2. Standard of reviews
  3. Overall reputation of the journal

And the activity researchers complained about most? Peer reviewing manuscripts.

In any case, if you want to build a good journal and attract great papers, make sure you have fast and competent peer review. (Duh!) Meanwhile, having a good printer or a good editorial board matters much less.

Computer Science is shallow

Zed A. Shaw—author of several books on Ruby and Python—came up with an interesting criticism of Computer Science. He makes some good points:

Computer Science is a pointless discipline with no culture. (…) They rarely teach deep philosophy and instead would rather either teach you what some business down the street wants, or teach you their favorite pet language like LISP. (…) Another way to explain the shallowness of Computer Science is that it’s the only discipline that eschews paradox. Even mathematics has reams of unanswered questions and potential paradox in its core philosophy. (…) There’s an envelope of knowledge so vast in most other disciplines that just when you think you’ve learned it all you find something else you never knew. This is what makes them interesting.

Oh! I think there are many deep and exciting questions in Computer Science. (And not just whether P is equal to NP.) And do Sociology, Economics and History have more depth? But I agree that Computer Science is too often utilitarian. Some like to pretend that by catering to the perceived needs of industry, graduates will get better jobs. Unfortunately, too often, the students have to unlearn their so-called “practical knowledge” once they leave the campus. The honest truth: you don’t need three or four years of college to do great work in the software industry.

Maybe more time should be spent on the deep questions. Here are a few discussion points that come to mind:

  • What is “meaning” and how can computation capture or codify it? What does it say about our brain? Is our brain a Turing machine?
  • Why are some programmers ten times more productive than others?
  • Can computers extend our intelligence? How intelligent can we become?

Chinese researchers publish more research papers

Funding agencies in Canada seek to emulate American funding agencies by promoting excellence. What this means in concrete terms is that a few professors get most of the resources whereas the bulk of university professors are left with a pittance or nothing. The intuition behind this more competitive approach is that we must catch up with American efficiency. We must reward the most productive researchers and stop wasting money on the unproductive ones. (Disclaimer: I am happy with the research grants I have received so far. Luckily, I have been judged productive…)

But how is the American system holding up against the competition? I looked at the countries publishing the most research papers in Computer Science, first in 1998 and then in 2008.

1998:

  1. USA (14,294 papers)
  2. Japan (2,941 papers)
  3. United Kingdom (2,706 papers)

2008:

  1. USA (15,744 papers)
  2. China (14,680 papers)
  3. United Kingdom (5,703 papers)

It appears that whereas most countries have at least doubled their production of research papers, the USA has stood still. Because these numbers are for 2008, I conjecture that right now, in 2010, Chinese researchers already publish more than their American counterparts. Of course, American authors are more cited, but the gap between China and the USA is closing in this respect as well. Interestingly, Americans also appear to be losing their edge compared to the United Kingdom, France, Germany and Canada.

While I do not have enough evidence to draw firm conclusions, I conjecture that the all-or-nothing approach so common in the USA may not be so efficient after all. By leaving most university professors behind, you are wasting precious resources. And I fear that by emulating this model, Canada might be losing out too.

Source: SJR.

Acceptance rate versus impact

Should you attend the most selective school? Maybe not:

Students who attended more selective colleges do not earn more than other students who were accepted and rejected by comparable schools but attended less selective colleges. (Dale and Krueger, Estimating the payoff to attending a more selective college, 1999).

Should you present papers in the conference with the lowest acceptance rate? Looking at this plot, there seems to be little correlation between acceptance rate and impact factor:

[Plot: acceptance rate versus impact factor]

(Source: Sylvain Hallé’s blog.)

Conclusion: The best schools or the best conferences may not be those with low acceptance rates.

Toward data-driven science

Science and business, so far, have been mostly model driven. That is, you collect a few data points, just enough to fit your model. Then you proceed from your model. However, things have changed:

  • Old: Manually take samples of the water in a nearby lake (4 times a year). New: Set up a wireless sensor in the lake (5,000 samples a day).
  • Old: Model an algorithm and test it once on an expensive mainframe computer. New: Build dozens of prototypes and test them on cheap laptops.
  • Old: Have an accountant prepare a business intelligence report once a year. New: See how the business is doing through your dynamic data warehouse.

Hence, improving access to data is fast becoming a critical issue. In a thought-provoking post, Andre Vellino sketches the future of information retrieval for data. Some key points:

  • Back in the early nineties, we had many electronic documents, but a comparatively poor infrastructure to share them. Then came the web and the search engines such as Google. Currently, we have many good data sets, but sharing and indexing them is painful. Clearly, we need to produce a better infrastructure for sharing data!
  • Research papers should reference data sets, by a unique identifier (such as a Digital Object Identifier), so that we can ask “What research relied on this data set?” or “Where can I find the data these authors have used?”

This is one instance where funding agencies should step in and encourage this work. It is not enough to encourage researchers to share their data. We need better tools too!

What is a good University?

Seth Godin wrote a devastating post on the future of higher education. Unlike Godin, I fail to see an imminent crash of higher education. But then, I failed to predict the recent financial market crash. However, as someone who spent most of his adult life on a campus, I have an idea of what students can hope to get out of higher education:

  • Meet other smart people who come on campus to study or work (including professors). This kind of stimulation requires engaging relationships. Sometimes, you can get some of the same benefits from a job, but not always. Interestingly, online education almost entirely fails in this respect. However, you can reproduce some of this effect online, on your own.
  • Degrees that are key to job-related certifications. Want to become a (medical) doctor, a lawyer or an engineer? Universities hold the keys. Interestingly, though, all these valuable certifications are supported by legal means and non-academic organizations (such as the bar or an engineering corporation).
  • University-bound financial support. Where are you going to get a (small) salary to work on proving a tough theorem, if not on a campus? Governments and donors are fond of funding universities.

Working from these benefits, how do you imagine higher education failing?

Acknowledgement: Thanks to Martin Lessard for pointing out Godin’s post to me.

The mythical reproducibility of science

David Donoho was among the first researchers to promote reproducible research through software publication (see Buckheit and Donoho, 1995). Fifteen years later, Donoho and his collaborators are even more insistent:

Scientific computation is emerging as absolutely central to the scientific method. Unfortunately, it’s error-prone and currently immature—traditional scientific publication is incapable of finding and rooting out errors in scientific computation—which must be recognized as a crisis. An important recent development and a necessary response to the crisis is reproducible computational research in which researchers publish the article along with the full computational environment that produces the results. (Donoho et al., 2009)

Their 2009 paper on reproducibility is insightful and well worth reading. I agree that sharing software is good for science, and for scientists.

Unfortunately, I fear we might lose sight of why we must publish our software.

  1. In theory, scientists should be constantly checking each other’s results. But that is not how science is done. You are rewarded for finding something new, not for checking someone’s results. So hardly anyone will ever download your code to check whether you cheated.
  2. Reproducibility and repeatability are not the same thing. It is great that I can rerun your code. But it does not follow that your code and results are right or useful.

Share your source code to spread your ideas:

  • Keep your packages simple. People need a few key pieces of code that they can integrate in their own software.
  • Use popular languages. Remember that repeatability is not enough: people are likely to tear apart your software to reconstruct their own.
  • Go beyond academia. Why assume academic researchers are the people who matter? Spreading your ideas among engineers is important as well.

The reproducibility that matters is getting people to use your ideas. Merely proving you are honest falls short of your potential!

Further reading: Does Open Sourcing your software hurt your competitiveness as a researcher?

On the design of design

Following a blog post by John D. Cook, I started reading Fred Brooks’ latest book. Brooks is famous, among other things, for his earlier book, The Mythical Man-Month. The book is a collection of essays, organized like blog posts. It is really engaging.

I had never read about design per se, except in Paul Graham’s essays. For me, the core message of the book is that writing software, planning houses, and writing books or poems are very similar tasks. The metaheuristics are the same. The lessons you must learn are similar. You are trying to solve very large NP-hard problems where you can’t reliably divide the problem space. Systematic greedy algorithms may work, but they may also mislead you. You need some formalism, some rigor, but you also need experience, or instinct.

What I like about my job

I’m currently a tenured professor with research grants and graduate students. Yesterday, I decided to list attributes of my job that I liked, in no particular order:

  • I have the best computer gear money can buy;
  • I spend most of my time thinking and writing;
  • I have no immediate financial worries;
  • I have a flexible schedule, I can work from my home, and I spend a lot of time with my family.

Comparatively, I didn’t like my job as a graduate student or a post-doctoral fellow even though I had most of these benefits… because I had a limited and temporary income. My job as an entrepreneur had most of these attributes, but it was inherently unstable (which led to some financial worries). My job at a government laboratory had most of these attributes as well, except that funding could be capricious.

What comes out of this analysis is that I highly value my financial well-being. As a dad with two sons, that’s hardly surprising. Also, I really enjoy working from home. Some days, I even go as far as hating my campus office: it has poor Internet connectivity, bad coffee, and so on. I’m much more prone to distraction when I work on campus. Also, I cannot walk my son to school in the morning if I have to be at the office at 9 am. So my campus office is more of a meeting place than a working place: I go there to meet with collaborators and students, not to think and write.

Meanwhile, there are attributes that didn’t come up:

  • Prestige: Though prestige has its uses—mostly in getting people to listen to you—I do not value it highly. I value what I produce, but not my current status. As an aside, I own a small house, a small car and my clothes are rarely brand-new.
  • Academic freedom: People in academia always stress how much they like their freedom. It is true that I really enjoy choosing the object of my work. However, have you noticed how much professors look alike? They tend to have the same ideas and work in similar ways, like a flock. Indeed, conformity is enforced even among academic researchers through funding decisions, publication decisions, peer reviews and so on. Hence, while I seek freedom in my work, I am not under the illusion that I have absolute freedom right now. In fact, I have lots of responsibilities and, often, most of my day is spent reviewing papers, marking assignments, answering questions, filling out forms, preparing talks, and so on. None of this is particularly exciting.
  • Power over others: As a professor, I could apply for large grants and then run a large research team. Or I could try to move into management. Yet I have no interest in having power over others. I always urge graduate students and post-doctoral fellows to find their own way. I propose ideas, examples or projects, but I rarely seek to run the show. For example, instead of telling graduate students what to do, I keep asking them what they are doing. I have found this question particularly powerful: “Is this really the best use of your time right now?”

Are there too many Ph.D.’s?

Would you accept work designing weapons of mass destruction? Back when I was in college, one of my most memorable philosophy assignments was a rebuttal to the claim that scientists working on weapons of mass destruction were responsible for the creation of those weapons. As intellectuals and scientists, do we have to bear full responsibility for our actions?

Somehow, we like to imagine researchers working on new weapons in lavish conditions. They are individuals with prestigious degrees who could have any job. Yet, they are enticed by greed to work on evil projects. Or are they?

I spent weeks in the library—the Web didn’t even exist!—documenting how difficult the job market for scientists was. As far back as twenty years ago, statistics showed that there were far more qualified scientists than corresponding jobs. (Ironically, years later, I became a scientist only to be surprised by how competitive the job market really is.)

Scientists may not be physically starving, but they have invested 10 or 20 years working toward a single goal: become a bona fide researcher. And jobs are scarce. Faced with reality, people compromise.

Yet, remember, this scarcity of Ph.D.-related jobs combined with a glut of Ph.D.s is not an accident:

By the fall of 1972, there are likely to be more Ph.D.’s looking for positions than there are (adequately salaried) positions with duties commensurate with Ph.D. training level in mathematics. (Anderson, Are there too many Ph.D.’s, American Mathematical Monthly, 1970)

Source: Sébastien Paquet.

The paperless campus: still a long way to go

Today I spent money from a research grant. Here is the process:

  1. I grab the form in Excel format.
  2. I fill it out.
  3. I print the form.
  4. I sign it.
  5. I give it to a secretary.
  6. The secretary gets the chair of my research center to sign it.
  7. The form is then sent to accounting, by internal mail (on paper).
  8. They review the form.
  9. They enter the data in a computerized database.

Suggesting that it could be improved using technology from this century is crazy talk.

So, you know what’s important?

Most researchers are convinced that their current work is important. Otherwise, they wouldn’t do it. Yet few of them work on obviously important things like curing cancer or solving world hunger. Rather, they do silly things like proving the Poincaré conjecture. A century to figure out some theoretical nonsense? Please!

So, why won’t researchers work on the important problems of our era?

The conventional explanation is that working directly on the major problems is like staring at the Sun. Instead, researchers must do routine work until a path toward greatness opens up. So real researchers…

  • survey existing work,
  • comment on special cases,
  • provide theoretical justifications for empirical observations,
  • validate new theory experimentally, and so on.

That is, researchers are not architects. They use greedy algorithms:

  • look at problems and results you can grasp with your current expertise,
  • select an interesting problem that is a good fit for your expertise,
  • rinse and repeat.
  • Wait! You are close to solving a major problem? Jump on it!

Should scientists feel guilty that they can’t prove the importance of each increment? I think not. I think scientists are inefficient, but there is no better way known to man. Indeed, consider how real innovation is typically unpredictable:

  • The greatest difference between my Honda Civic and the car I drove as a teenager is that I can lock and unlock all doors with a remote. This single function made all the difference in the world for me. I drive my wife nuts as I keep playing with the remote: lock, unlock, lock, unlock… And people thought we would have flying cars!
  • I am sure that Google offered better search results than AltaVista. Yet the real reason I switched to Google and never looked back is that they did away with the big annoying ads. Understanding that you didn’t need to annoy your users to make a profit was Google’s greatest innovation. (Don’t let them fool you into thinking PageRank had something to do with it.)
  • Amazon.com is by far the best e-commerce site on the planet. But what is different? In fact, a lot of little things. On the surface, Amazon.com is just an HTML view of a database. But they have collected many small innovations that, when put together, make a huge difference.

My point is that innovation in the little things adds up to important and practical results. That is why academic researchers spend so much time writing surveys or studying a detail to death. They don’t think their own work will change the world, but they count on others doing the same thing. They hope that when they put it all back together, the result is great. For the last two hundred years or so, they have fared extremely well.

To put it another way: greedy algorithms can be pretty good. They can certainly beat 5-year plans.

Further reading: Innovative ideas are indistinguishable from crackpot ones

External-memory shuffling in linear time?

You can sort large files while using little memory. The Unix sort tool is a widely available implementation of this idea. Files are written to disk sequentially, without random access. Thus, you can also sort variable-length records, such as lines of text.

What about shuffling? Using the Fisher-Yates algorithm, also known as Knuth’s algorithm, you can shuffle large files while using almost no memory. But you need random access to your files. Thus, it is not applicable to variable-length records. And indeed, the Unix sort command cannot shuffle. (It has a random-sort option, but it is not a shuffle. Meanwhile, the shuf command runs in RAM.)
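For reference, here is a minimal Python sketch of the Fisher-Yates shuffle. Note the arbitrary index accesses: this is exactly why the algorithm requires random access and does not map onto sequential file I/O.

```python
import random

def fisher_yates_shuffle(records):
    # Walk the array backward; swap each position i with a uniformly
    # chosen position j <= i. Every swap is an arbitrary index access.
    for i in range(len(records) - 1, 0, -1):
        j = random.randint(0, i)
        records[i], records[j] = records[j], records[i]
    return records
```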

A solution: Tag each record with a random number. Pick the random numbers from a large enough set that the probability of any two lines receiving the same number is small. Then use external-memory sorting. You can implement something similar as a single line in Unix.
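In Python, the tag-and-sort idea looks like this (an in-memory sketch with a hypothetical function name; in the large-file setting, an external-memory sort such as the Unix sort tool would play the role of the in-memory sort):

```python
import random

def shuffle_by_sorting(lines):
    # Tag each record with a random 64-bit key: with 2**64 possible
    # keys, collisions among millions of lines are very unlikely.
    tagged = [(random.getrandbits(64), line) for line in lines]
    tagged.sort()  # an external-memory sort would go here for big files
    return [line for _, line in tagged]
```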

A better solution? Shuffling is possible in linear time, O(n). Sorting is a harder problem (in O(n log n)). Thus, using a sort algorithm for shuffling, as we just did, is inelegant. Can we shuffle variable-length records in linear time without random access?

Maybe we could try something concrete? Consider this algorithm:

  • Create N temporary files, choosing N large enough that the size of your entire data set divided by N is likely to fit in RAM.
  • Assign each string to one temporary file at random.
  • Shuffle the temporary files in RAM.
  • Concatenate the temporary files.
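A minimal in-memory sketch of this algorithm in Python, with lists standing in for the N temporary files (the function name is my own):

```python
import random

def bucket_shuffle(lines, n_buckets=16):
    # Pass 1: scatter each record into a random bucket. This needs
    # sequential writes only, so variable-length records are fine.
    buckets = [[] for _ in range(n_buckets)]
    for line in lines:
        buckets[random.randrange(n_buckets)].append(line)
    # Pass 2: shuffle each bucket in RAM, then concatenate.
    result = []
    for bucket in buckets:
        random.shuffle(bucket)  # each bucket is assumed to fit in RAM
        result.extend(bucket)
    return result
```

Each record is touched a constant number of times, so the whole procedure runs in linear time.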

Something similar was described by P. Sanders in Random Permutations on Distributed, External and Hierarchical Memory (Information Processing Letters, 1998). See also the earlier work by Sandelius (A simple randomization procedure, 1962) as well as Rao (Generation of random permutation of given number of elements using random sampling numbers, 1961).

Who the heck got Universities into the email business?

My current employer, UQAM, refuses to allow email forwarding. Students would rather forward their emails to their existing GMail accounts, for example. And the IT Department (the SITEL) agrees that it would have several benefits. However, they refuse to allow it for the following reasons:

  • Email forwarding may create infinite email loops. These may disrupt services and require human intervention.
  • Invalid or failing remote servers may saturate the local servers as they are unable to forward the emails.
  • Professors and management send confidential information by email. Yet, without full control of the email service, the University cannot ensure the needed confidentiality.
  • With email forwarding, it may be impossible to ensure and prove that an email was received and read. Thus, homework assignments, administrative inquiries or security advisories may never reach the students, or we may be unable to prove that they reached the students because of email forwarding.
  • As a Canadian University, email forwarding puts us at risk that the emails may transit on American servers, where the Canadian law on privacy is not applicable.
  • Email forwarding may put students at risk if remote accounts are stolen or lost.

Can you help me debunk or mitigate these arguments? I know that some of these arguments are bogus, but I am looking for solid references. (Not that I expect to change their mind.)

A larger issue: shouldn’t universities stick with research and teaching? I understand that we must have networks, cables, computers, firewalls, but do we need to provide our students with email services?

Update: Turns out that our IT people encourage students who want forwarding to GMail (say) to use the POP3 protocol. It is unclear to me how email forwarding can be a dangerous practice whereas POP3 “forwarding” can be safe.

The best software developers are great at Mathematics?

One of the upsides of working for a university is the stimulating academic discussions. Yesterday, a philosopher challenged me with a question:

Beyond the fact that software is expressed in mathematical artefacts (bits, algorithms), are Information Systems fundamentally mathematical?

For my convenience, I temporarily rephrase the question to something simpler and more concrete:

How are Software Developers limited by their mathematical weaknesses?

I plan several blog posts around this question, but let me start with an example.

A common and powerful language to process XML is XPath. XPath is used within web applications, scripts, databases, and so on. I often ask students the following question about XPath. Are these two expressions equivalent?

$x="some string"

and

not($x!="some string").

(The symbol “!=” means “different from”.)

Invariably, most students conclude that they are equivalent. Wrong!

Let us examine the semantics.

  • The expression $x="some string" means that at least one element of $x is equal to "some string".
  • The expression $x!="some string" means that some element of $x is different from "some string".
  • The negation of $x!="some string" is that all elements of $x are equal to "some string". (Sorry if it sounds confusing.)

Thus, the expression not($x!="some string") is a more restrictive condition than the expression $x="some string".
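These XPath semantics can be mimicked in Python with any() and all(), treating $x as a sequence (the helper names eq and not_neq are my own, for illustration):

```python
def eq(xs, s):
    # XPath $x = "s": true if SOME element of $x equals s
    return any(x == s for x in xs)

def not_neq(xs, s):
    # XPath not($x != "s"): true if NO element of $x differs from s,
    # i.e., ALL elements equal s
    return all(x == s for x in xs)

x = ["some string", "other string"]
print(eq(x, "some string"))       # True: at least one element matches
print(not_neq(x, "some string"))  # False: not every element matches
```

With a two-element sequence, the two expressions disagree, which shows they are not equivalent.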

Great software developers routinely think through far more complex mathematical problems. Yet, they do not think of them as being Mathematics.