05 November 2009

CentOS 4 series on K6-II

Tru just pointed out http://i586.centos.org/ which is an archive of the fruit of the push to get the AMD K6-II / Intel i586 install ISO working.

Nice stuff, and nice to know the effort was not wasted...

08 October 2009

... I am the eggman

One recent addition to Python module packaging at Red Hat, in its Fedora project, is carrying along additional, optional structured metadata about the contents of a module (package), held outside of the RPM database


This additional information, egg-info, came into Python at version 2.3 and following. More may be learned about these eggs at: egg-info (optional extra Python metadata about a Python module).
See: Fedora specifics

This new detail is a two edged sword. On one hand, it provides sufficient information for an ad hoc, root level install process using native Python tools, which makes for an easy_install [the skeptic in me suggests it might be 'easier', perhaps, along some skewed axis of a metric of goodness]. See also the egg superset, Python setuptools, which now works well in RPM-mediated space


Sadly, down this path, modules which are outside of the protections and manageability of the RPM packaging system may "easily" and inadvertently be introduced by an incautious admin, and thus introduce Python code into an otherwise controlled system. This is the horror of a mixed RPM and CPAN system, all over again. As I say, one needs to choose a 'metric of goodness' with care

Incautious use of mixed packaging approaches in turn can lead to security and update headaches. Using such non-packaging system tools can break the SPOT -- single point of truth -- for determining what versions of binaries a given host is using. From sad experience, this way lies additional work, and madness
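
One way to see whether a host has already drifted from that single point of truth is to ask RPM about every file under a Python module directory. The following is a minimal sketch, not a finished audit tool: is_owned and list_unowned are hypothetical helper names, and the example path is an assumption that depends on the local Python version. It relies only on `rpm -qf`, which exits non-zero when no installed package owns a path.

```shell
#!/bin/sh
# Sketch: list files under a directory that no RPM package claims.
# is_owned is a hypothetical helper; it wraps the real `rpm -qf` query,
# which exits non-zero when no installed package owns the path.
is_owned() {
    rpm -qf "$1" >/dev/null 2>&1
}

# Print every regular file below $1 that fails the ownership check.
list_unowned() {
    find "$1" -type f | while IFS= read -r path; do
        is_owned "$path" || printf '%s\n' "$path"
    done
}

# Example (the path is an assumption; adjust to the local Python version):
# list_unowned /usr/lib/python2.4/site-packages
```

Anything it prints arrived by some route other than the packaging system, and deserves a second look.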


But on the other edge of that blade, the egg-info adds descriptive narrative, and cautiously used, increases the usability of a system that does not (ab)use those native installation tools

As noted, the FOSS world has faced this problem before with perl and CPAN. Weak and strong 'includes' versioning, and questionable security models in the @INC-style 'include' path search practices of Python and perl, are well known failings in their community archive models. I faced it recently in a packaging push of CRAN modules for R -- hmmm, I still need to file a few bugs upstream to solve some problems I saw in some R module packaging choices that I consider poor ones


There is not a single, objectively 'technically right' way to proceed, but rather just approaches that are consistent, or not consistent, with packaging system design and usage choices

Fedora helpfully offers a sample stanza to use in .spec files. I cruised my archive of .spec files to see what else turned up

# See if there's any egg-info
if [ -f %{buildroot}%{python_sitearch}/Conch*.egg-info ]; then
echo %{buildroot}%{python_sitearch}/Conch*.egg-info |
sed -e "s|^%{buildroot}||"
fi > egg-info

and later, using the %files stanza's -f file list option

%files -f egg-info
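
The stanza above tests a single glob with `-f`, so it only works when exactly one regular file matches. A more general variant can be sketched in plain shell, under stated assumptions: egg_info_list and its two arguments are hypothetical names, and in a .spec file the arguments would be %{buildroot} and %{python_sitearch}. It prints each egg-info entry, file or directory, with the build root prefix stripped.

```shell
#!/bin/sh
# Sketch: print the packaged path of every *.egg-info entry under a
# build root, one per line, with the build root prefix stripped.
# egg_info_list and its arguments are hypothetical names.
egg_info_list() {
    buildroot=$1
    sitedir=$2
    for entry in "$buildroot$sitedir"/*.egg-info; do
        # An unmatched glob is left as the literal pattern; skip it.
        [ -e "$entry" ] || continue
        printf '%s\n' "${entry#"$buildroot"}"
    done
}
```

Redirecting its output to a file named egg-info would give the same kind of list that the -f option consumes; an empty result simply leaves that file empty.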

It appears I need to do some work in my local archive of SRPMs. Never enough hours in the day

02 September 2009

Like a stake through the heart

The CentOS 4 series point refresh has been out on the mirrors for a couple of weeks now, as have the updates that were backlogged behind it. But the AMD K6-II / Intel i586 install ISO was not right when we shipped, and we knew it


Akemi 'toracat' Yagi had it working in her side archive, and kept working the issue with Johnny 'hughesjr' Hughes, and candidate ISOs have been in testing in the QA back channel. I got a 'heads up' on a new test candidate from hughesjr yesterday afternoon, and around 5 am today, a notice that a new candidate was ready for pulling and testing

I put lftp to work, and burned the CD. Booted with the command line parameter:

: i586 text

and did a minimal install

Eureka -- it works in mainline CentOS

Coming soon to a mirror near you (for the four or five users of such old kit). The unit I am testing on was my workstation on 11 September 2001, and I long since consigned it to the boneyard

090902: fixed grammatical error

11 August 2009

A bit more on CentOS 4.8 and the K6-II

Yesterday's post on the K6 covered getting a CentOS 4.8 beta candidate installed on ancient hardware. The careful reader may have noticed that I had an unexplained list item early on in that outline:

Add to /etc/yum.conf

This is not something that just occurred to me unbidden, but rather came from an awareness that the upstream has had the dreaded 'Regression' from time to time in its RHEL 4 series, where a patch needed to support the K6/i586 architecture was not consistently present. In reading the bug comment notes, it seems that the 'boneyard' available to the member of the kernel testing team tasked with this is not so full of carcasses as mine, and so he cannot test his fixes as well

So, I took affirmative steps to preemptively 'partition away' the need for an updated working kernel from our 4.8 beta install candidate, and yet be able to get to a working chassis with the kernel from the 4.5 final image, which is known to work. Good thing. The regression is back in the 4.8 kernel SRPMs, and the needed patch got dropped, it seems (this from an initial workup -- detail testing will be needed to see)

The workaround is straightforward: Akemi 'toracat' Yagi maintains a testing 'plus' archive, containing kernels with the needed patch, and I can confirm that her candidate works fine. See: http://centos.toracat.org/kernel/centos4/centosplus-testing/i386/

Thanks, toracat

Advancement of technical skills with CentOS project tools

I posted this piece inside a runaway thread on the CentOS mailing list. It represents my opinions, and is not some policy statement of the CentOS project. To a degree it reprises earlier pieces on how to advance one's technical skills with CentOS, but it is worthwhile carving it out, so I have a reference point for discussing sub-pieces of it here. Others have other views

If a person wishes to be advanced in the CentOS project, contribute to the project. [It is not clear to me WHY people think there is some huge benefit for being a 'project insider' as it is really just a chance to do more work. Early access to QA is just not that hard to earn] We are not likely to hold your hand much, but will answer questions well framed. Be a self starter. Do something material. Some things to do to gain my notice as a contributor of merit:

  1. The bug tracker is open self serve for people to sign up. Add its RSS feed, and read every one as it crosses. Start working through the bugs to replicate or note an inability to replicate issues; work through the bug tracker from latest to earliest, seeing if there is a similar upstream bug, or a fix, or if an issue is CentOS local. Note your results. That would be useful
  2. The centos-docs ML is open for proposals of new content into the wiki. Add its RSS feed, and read every commit diff as it crosses. Fix broken stuff that can be fixed at once. Some even believe it is more useful to re-write documentation locally rather than feeding improvements upstream so that it flows back down and out into RHEL, Fedora, etc as well as just CentOS [I do not, and refer you to Fedora to push non-CentOS specific content out more widely]
  3. Set up a local mirror of SRPMs, not just of the released Enterprise sources of upstream, but its RawHide as well. I have a daily diff report in my email queue each morning to scan for new material to review. Start building and testing and filing bugs to make the .spec files more general and less distribution specific, so that cross pollination can occur. You may get rejected (I often am), but at least try to improve the breed
  4. The same problems repeat time and again in the Forums. Add its RSS feed, and read every new post as it crosses. Add pointers or content as needed, and 'cc' into updates on the thread. I have noticed an excellent trend, that lately the three or four regulars are moving content more to the correct tree location, and asking questioners to do their research, and dropping out-links to answers rather than answering in line. I like to do this as well when I form an answer, there or on a mailing list that is archived, as it provides the linkage hints Google needs to note 'reputation' and to weave answers together
  5. Join the main IRC channel or mailing list, and confirm you can answer every question posed for a solid week; if not, fill in your knowledge gaps with experimentation. At that point, start thoughtfully pointing a person toward the answers. Spoon-feeding is NOT a good thing, and does not gain any points in my eyes, as that is not the stated purpose of the channel

    The mailing list is looser as to /on topic/ but when a person repeatedly recommends 'non-CentOS' approaches over acceptable CentOS product, I'll certainly notice ... and that is perhaps not a good thing for further advancement. I _USE_ tinydns some places where it is the right fit, but I don't mention it here
  6. Once you have demonstrated skills, ask to be admitted to the next QA effort (we get three or four point update chances a year), and do QA. People who sign up and are admitted often slack off [don't participate in the ML, don't file reports, are not in IRC], and by that inaction demonstrate they are not interested in progressing further. People _do_ get busy with real life or have to rest from burnout and take time off
  7. Once you have demonstrated skills, ask for some special project to build some element of needed infrastructure that is not otherwise getting done, and do it. John Pierce's post earlier this week certainly caught my eye, as he demonstrated self-starter problem solving skills in a complex space I had not seen before. He is now on my 'watch list' to draw into the project
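
The daily diff report mentioned in item 3 can be approximated with standard tools. This is a minimal sketch under stated assumptions, not my actual report script: new_srpms, the mirror directory, and the state file are hypothetical names, and it only compares file names between runs.

```shell
#!/bin/sh
# Sketch: report SRPM files that appeared in a mirror since the last run.
# new_srpms, the mirror path, and the state file are hypothetical names.
new_srpms() {
    mirror=$1      # directory holding the mirrored *.src.rpm files
    state=$2       # sorted listing saved by the previous run
    current=$(mktemp) || return 1
    # List SRPM file names, sorted so comm(1) can compare the listings.
    find "$mirror" -type f -name '*.src.rpm' -exec basename {} \; |
        LC_ALL=C sort > "$current"
    # First run: start from an empty listing.
    [ -f "$state" ] || : > "$state"
    # comm -13 keeps only the lines unique to the second file.
    LC_ALL=C comm -13 "$state" "$current"
    mv "$current" "$state"
}
```

Run from cron once a day, with the output piped to a mailer, something like this would produce a list of new material to review each morning.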

More personal opinion: Will any of those 'earn' a centos.org mailing address as someone lamented they lacked earlier in this thread? Sometimes, but frankly, we don't give those out easily. I saw a remark earlier:

In the meanwhile some things ... are getting a bit clearer so I guess we are on the right track

'We' can perhaps be read here as a generic 'things are on the right track' -- but frankly, the only 'we' that I would look to for authoritative statements as to the project are people with a '@centos.org' in their email address. There is back channel coordination, infrastructure, and much more

10 August 2009

Beta testing CentOS 4.8 with an AMD K6-II

Painful does not begin to describe how laborious it seems, after using more modern kit.

It appears that the AMD K6-II instruction set is a superset of that used on the i586 series. Some folks seem to be still running such, and we have a number of resolved bugs in the tracker, detailing various ways to get the units running

Based upon exhortation and advice in the CentOS QA mailing list and some IRC banter, I was induced to drag one of these poor exhausted clunkers out of my boneyard, and do some testing on it

These installation instructions SHOULD work on i586 as well, but I no longer have an exemplar to confirm with:

  1. Download the 4.5 i386 ISO from vault.centos.org, install from it, and start it up with the following options

    Boot it with: i586 text nomce

  2. Manually install openssh-server, enable, and set up with iptables, so you can hop on the unit from a remote box to work on it
  3. Add to /etc/yum.conf

  4. Perform a general run of updates against the intervening changes prior to 4.8 -- (seemingly 4.7 and intervening updates when I performed this testing) -- lots there, but get it close to current.

    Install 6 Package(s)
    Update 150 Package(s)
    Remove 0 Package(s)

    ... took forever as I only have 128 MB of RAM for this old beast --- 308 transaction steps
  5. Do an interim reboot
  6. Point at my local mirror of the CentOS 4.8 release test candidate and let it rip --

    first pass only:

    without the later pending updates:

    Install 1 Package(s)
    Update 83 Package(s)
    Remove 0 Package(s)
    Total download size: 117 M
  7. Do a second interim reboot

    Mysteriously, I got an 'unclean shutdown' FSCK required message as to /boot here ... no idea why
  8. Run yum again, for a second pass with the updates

    Install 0 Package(s)
    Update 9 Package(s)
    Remove 0 Package(s)
    Total download size: 9.8 M

  9. Do a final interim reboot
  10. I completed my test suite without incident

I am advised similar steps may work from later than a CentOS 4.5 ISO, and that i586 should work as well. As I lack the hardware to test this, your mileage may vary
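
Step 2 of the outline above might look like the following. It is a sketch rather than the exact commands I typed: run and enable_remote_access are hypothetical helper names, and the rule assumes sshd listens on the default TCP port 22. With DRY_RUN=1, which is the default here, the commands are only printed, so nothing changes until the list has been reviewed.

```shell
#!/bin/sh
# Sketch of step 2: install and enable sshd, and open TCP port 22.
# run and enable_remote_access are hypothetical helper names.
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

enable_remote_access() {
    run yum -y install openssh-server
    run chkconfig sshd on
    run service sshd start
    # Insert the rule at the top of INPUT so it precedes any REJECT rule.
    run iptables -I INPUT -p tcp --dport 22 -j ACCEPT
    # Persist the running rule set across reboots.
    run service iptables save
}

# Review the printed commands first, then as root: DRY_RUN=0 and call again.
# enable_remote_access
```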

Poor old boxes. Let them rest. Save power. I need a shower. Yuck

03 August 2009

Life in the Fast Lane

Slow down, you move too fast.
You got to make the morning last.
Just kickin' down the cobble stones.
Looking for fun and feelin' groovy.

  -- Simon and Garfunkel

I picked up my wife at the airport late Wednesday night as she returned from a trade association conference related to her job. As we drove home, she talked about something unusual that happened there. It seems 'New Media' and 'Social Networking' tools have popped up on their radar, but her peers are wrestling to understand the motivations, and how to participate. It seems she astonished them, describing the FOSS stories and tools that I 'bring home' as I recount the day at the dinner table: websites, wiki, mailing lists public and private, user group meetings, IRC, blogging, Twitter, VOIP, and so forth. They were 'wowed' that an old guy like me had used Twitter, and a quick Google tour into the Wikipedia, to answer, within seconds of its coming up, a son's question that an Admiral had raised in a meeting at his job consulting for the federal government in metro DC a while back

I suppose I take for granted the pervasive availability of such tools, which largely are built on a foundation of the fruit of the 'Software Libre' movement, and live a comfortable existence in this virtual reality. Although my hair has been gray for a couple of decades, it is not the me of my self-image, where I still feel 25 and full of vigor. That I whistle, and know the words of pop tunes from 40 years ago and play word games on the tunes at the coffee shop with the barista does not jar me, although if I get a young one, they clearly have no idea what I am riffing on

That to one side, I still revel in the wonder of the tangible world; a world of taking the family to the State Fair, or working with my hands, wood, and tools repairing a grandchild's wagon. I wrote the first draft of this piece -- a blog post -- with pen and paper with no plan on my mind beyond reconnecting with myself after a hard week, not just the CentOS matters, but in my local physical world as well; I should perhaps rather say, this piece wrote itself, flowing out of my hand's motions, creating, and editing on the paper before me, with strike-throughs, insertions, and circled blocks of text indicating movement of thoughts into a flow

I got no deeds to do, no promises to keep.
I'm dappled and drowsy and ready to sleep
Let the morningtime drop all its petals on me
Life, I love you, all is groovy!

Thinking back as to how I write, I sorely miss the older times of a ready steno-typist, secretary trained in shorthand, and later a ready 'Dictaphone', and the 'gal Friday' legal secretary who helped organize my worklife for many years. I did the creative work, and she straw-bossed the rest behind the scene, as I turned to face the next 'fire'. Each hard work in its own right, and a great and productive partnership. She's dead of lung cancer now -- was a smoker. Ah well

The economics of such luxuries are prohibitive to most in an era where a person who cannot touch-type is perhaps now considered not yet fully literate. Welcome to the next lap of the rat race in this brave new world

When the positions of transcriptionist, book-keeper, and sales clerk, along with the others mentioned above disappeared, and 'progress' came to the smaller enterprise, they were replaced by the small individual computer, word-processing, Quick Books, etc. Oh, and a subtle transfer to self-service responsibility to do all the work with less facility for delegation. Layers of support costs disappeared, as did the middle management, as entities had to flatten the organizational chart, or be outraced by their competitor

Of course, the workload did not go away, any more than a completely 'paperless office' has emerged. The load shifted up to what were formerly more 'knowledge work' folks -- supervisors, or in a small enough firm, the entrepreneur owner, or just was no longer done ... sometimes the customer is 'drafted' to scan bar-codes and pay a cold machine, and no human hand on the part of the vendor can be found. Just try to find a phone number for eBay or Amazon live support some time


We as a culture have weakened and removed spare resource capacity needed to build and nurture long term repeat customers, in favor of cost efficient transactionalism. Gresham's Law, all over again

Ba da da da da da da ba bap a dee...

During the week I too must prioritize, and work away at the hottest items in Covey's Quadrant One, as my schedule dictates them to me. Less important dreams and promises, desires and goals are left for an open dated 'later.' In my heart of hearts, however, I know that later will never come. Those 'heart's desires' are left behind on the horizon of each new day, for dead

I can offer no remedy, save a caution, when building that schedule, not to mistake a capability to act immediately for a mandate to do so

Where does the answer lie?
Living from day to day
If it's something we can't buy
There must be another way

We are spirits in the material world

  -- The Police, Sting

06 Aug 2009: edited for a typo/grammar fix, layout error

30 July 2009

sadly, an Open Letter to Lance Davis

Open Letter to Lance Davis

July 30, 2009 04:39 UTC

This is an Open Letter to Lance Davis from fellow CentOS Developers. It is regrettable that we are forced to send this letter but we are left with no other options. For some time now we have been attempting to resolve these problems:

You seem to have crawled into a hole ... and this is not acceptable.

You have long promised a statement of CentOS project funds; to this date this has not appeared.

You hold sole control of the centos.org domain with no deputy; this is not proper.

You have, it seems, sole 'Founders' rights in the IRC channels with no deputy; this is not proper.

When I (Russ) try to call the phone numbers for UK Linux, and for you individually, I get a telco intercept 'Lines are temporarily busy' for the last two weeks. Finally yesterday, a voicemail in your voice picked up, and I left a message urgently requesting a reply. Karanbir also reports calling and leaving messages without your reply.

Please do not kill CentOS through your fear of shared management of the project.

Clearly the project dies if all the developers walk away.

Please contact me, or any other signer of this letter at once, to arrange for the required information to keep the project alive at the 'centos.org' domain.

Russ Herrold
Ralph Angenendt
Karanbir Singh
Jim Perrin
Donavan Nelson
Tim Verhoeven
Tru Huynh
Johnny Hughes

08 June 2009

Phat pipes

Check the top row, right entry ... peaking at 44 megaBytes per second, and a lesser rate sustained over 8 hours; all relevant filtering bridges, and servers in the transfer at our end are running ... CentOS

We've spent the last couple of months in the buildout of our (new) presence in the North data center. We have sites in the central city, on the Dublin fiber ring, and through the north end AT&T switching center, but each has had its faults over time. The downtown 'carrier hotel' was offline for four hours due to a lack of redundancy in its generators during last September's multi day power outage; the Dublin fiber ring peering exchange point had issues as well, but longer; our multi-site strategy saved the day as none of our customers lost inbound data nor went dark in their web presence; uplinks were not affected as we handle them over different routes. In the last couple weeks, AT&T's congestion issues have re-appeared at their plant as well when we were 'babysitting' a large CAD/FEA file transfer ... again multi-gig

The new data center is pricey -- but in addition to the care at the physical layer, it is BGP multi-homed and has really fat pipes. The screenshot up top shows the inbound consumption in green. Initially we had a hard cap on our switch to limit it to 10 MegaBytes/Sec inbound -- but we were doing a large (multi hundred gigabyte) pull, and dropped the cap once it was clear all was working well

We are in the paperwork phase at the moment with ARIN, to clear up some 'lint' on our ASN, but with any luck by the end of the month, we'll have completed the cutover

11 May 2009

Rainy Days & Mondays

Karen Carpenter made the song famous for its authors, but clearly none of them were sysadmins

The rule, long known, for sysadmins is:

Never make a major change on a Friday, nor before leaving for vacation

I've been wrestling with the fallout from a violation of the sysadmin's rule by an upstream provider -- the vendor pushed in some change on Friday in the preparation of CDR -- Call Detail Records. For four days running, my sub-processes which manage the account have been failing for want of data. Those processes retrieve and apply CDR data, to emit accounting detail for customers, and have not been working

I've filed five or six sub-issue tickets which that primary change exposed, in trying to get the matter resolved: The current Firefox cannot open tickets under the current Windows XP, current SP [no problems with CentOS and Firefox or konqueror]; my 'closed' tickets were not visible; tickets were being closed by upstream before I confirmed a fix worked, so I ended up essentially re-opening the same ticket three times as each day's CDR pull failed; I was not receiving email updates of tickets; and so on. I am quite sure they consider me a 'stickler for details' and something of a pedantic pest at the moment, but dammit, I'm paying their bills. The PHB supervisor may want tickets closed quickly; but I want my issues fixed first

... as no one likes to be called into work on the weekend to revert a change, the sysadmin's rule must be faithfully applied

Getting to a x86_64 build environment

In the #centos IRC channel on freenode, today, a new user was trying to clean out the 'multi-lib' artifacts in his build environment, so that it was only generating 'x86_64' results

Tom mentioned:

10:26 Zathrus> realistically, just removing glibc.i?86 should nuke everything else...

and so I fired up a victim xen instance to test that hypothesis

sudo su -
cd /etc/xen
cp centos-5-x86_64-test centos-5-x86_64-victim
joe centos-5-x86_64-victim
# the edit is to rename the instance name, and the image to be used
cd /var/lib/xen
cp centos-5-x86_64-test.img centos-5-x86_64-victim.img
xm create centos-5-x86_64-victim
virt-viewer centos-5-x86_64-victim
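
The copy-and-edit steps above can be folded into one helper. This is a sketch only: clone_guest is a hypothetical name, and it assumes the config file refers to the guest solely by its name (in the name and disk lines), so a plain textual substitution is enough. A config that pins a MAC address or UUID would still need a hand edit.

```shell
#!/bin/sh
# Sketch: clone a Xen guest by copying its config and disk image under
# a new name. clone_guest is a hypothetical helper; the default paths
# match the ones used in the session above.
clone_guest() {
    src=$1; dst=$2
    cfgdir=${3:-/etc/xen}
    imgdir=${4:-/var/lib/xen}
    # Refuse to overwrite an existing guest.
    [ ! -e "$cfgdir/$dst" ] && [ ! -e "$imgdir/$dst.img" ] || return 1
    # Rewrite every mention of the old name (guest name and image path).
    # The name is used as a sed pattern, so it should not contain
    # regular expression metacharacters or the | delimiter.
    sed "s|$src|$dst|g" "$cfgdir/$src" > "$cfgdir/$dst" &&
    cp "$imgdir/$src.img" "$imgdir/$dst.img"
}

# Example, as root:
# clone_guest centos-5-x86_64-test centos-5-x86_64-victim &&
#     xm create centos-5-x86_64-victim
```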

Then inside the instance as root, I ran:

rpm -qa --qf '%{name} \t %{arch} \n' | sort > pre-remove.txt
yum remove glibc.i?86
rpm -qa --qf '%{name} \t %{arch} \n' | sort > post-remove.txt
grep -v x86 post-remove.txt | grep -v noarch

getting the result:

gpg-pubkey (none)
libaio i386
libgcc i386
python-devel i386

That's a pretty good result for a first pass, and a quick hack. I think I'll go down for some coffee, and think about it a bit more
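
The two listings saved above also make it easy to see exactly what the removal took away, and what is left over. A minimal sketch, assuming both files were sorted in the same locale as in the commands above; removed_packages and foreign_arch are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: summarize the package listings taken before and after the
# removal. removed_packages and foreign_arch are hypothetical names.

# Lines present only in the "before" listing, i.e. packages that went
# away. Both files must be sorted the same way for comm(1) to work.
removed_packages() {
    comm -23 "$1" "$2"
}

# Entries whose architecture (the last field) is neither x86_64 nor
# noarch. gpg-pubkey entries report "(none)" and are skipped as well.
foreign_arch() {
    awk '$NF != "x86_64" && $NF != "noarch" && $NF != "(none)"' "$1"
}

# Example:
# removed_packages pre-remove.txt post-remove.txt
# foreign_arch post-remove.txt
```

Matching on the architecture field, rather than grepping for the string x86, avoids accidentally hiding a package whose name happens to contain it.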

05 May 2009

Revenge of the Jedi (Part II)

I wrote last week from memory about the use by the Basel II standard of a method, additively combining non-linear and correlated risk events

I see a post on the R-SIG-Fin mailing list from conference organizer Jeff Ryan, that the presentations from R/Finance 2009 are up

Page 4 on the PDF of the Klaus Rheinberger, et al. presentation nicely states the executive summary that this is 'problematic'. The work then shows a worked example

Let's call Basel II what it is -- a top down pronouncement on meaningless rules, written in a fashion that is willfully ignorant of the lessons from the US S & L 'hot money' and actuarially un-sound deposit insurance debacle 20 years ago, of LTCM as to correlated risks and 'being the market', and of the recent Credit Default Swap insurance blowup of AIG

01 May 2009

May Day celebration

Renegade People's Movement -- our leader

I see a tweet from KB about the monthly mailman mailing list reminder emails; I took steps long ago to use procmail to watch for these, and re-mark their subject line. I then sort my mailspool by subject in alpine, and delete this noise all in one pass.

# mailing list memberships reminder
:0 fw
* ^Subject: \/.*mailing list memberships reminder
  | formail -i "Subject: [mlmr] $MATCH"     \
    -A "X-Reminder:$MATCH"    \
    -A "X-Munge: moved mailing list memberships reminder"

All power to the people

27 April 2009

Revenge of the Jedi

"The pen is mightier than the sword"
I posted a bit earlier today about the forgotten religion of Monetarism in the context of my weekend at a conference in Chicago. I had not heard mention of the faith, nor seen anyone but myself doing analysis using the old tools for a long, long time

I blogged a bit back about Jim Chanos' critique on CNBC of the new Mark to Market 'requirements' and the artificiality of the Basel II reserve requirement target value, Chanos suggesting a relaxation to a transition value of say 1.5 percent to 'conform' to Basel II in the short term. Two weeks ago, I had mentioned 'A Monetary History of the United States' (Friedman, Schwartz) to a friend wanting to understand how we got where we are; yesterday evening, I was discussing the Nixonian repudiation of Bretton Woods and the need to revisit Basel II, as I saw a clear demonstration that Basel II is defective at the conference. I do not have my notes at hand, the slide decks are not up yet, but I believe it was in the mixed currency risk analysis (Austrian, Swiss, and back to Central Europe as to residential property loans), which I believe Rheinberger gave entitled: 'VEC and GVAR Models using R' which exposed quite clearly that the 'experts' are using simple additive risk summing in Basel II, seemingly oblivious of the concept of the non-linear nature of correlated risks

The afternoon's email brings a report that Anna Schwartz is still out there as well

The old craft will live so long as a single practitioner remembers it

"Hokey religions and ancient weapons are no match for a good blaster at your side, kid."
The good fight continues; I'll keep swinging with the tools I know, thanks

p.s.: I do know the regular titles for the third and sixth released Star Wars films

'R' you experienced?

" ... To confer, converse, and otherwise hob-nob with my brother wizards ..."

I spent a productive weekend up in Chicago, at the R/Finance 2009: Applied Finance with R conference, which billed itself as the "first annual R/Finance conference for applied finance using R". The conference organizers and hosts are the 'usual suspects' on the 'R sig fin' mailing list; Jeffrey Ryan, Dirk Eddelbuettel, Dale Rosenthal, Brian Peterson, Peter Carl, Gib Bassett, and John Miller, assisted by the talented and imperturbable Holly Griffin of UIC. Pretty clearly most of this group code together regularly; see the committer list on the blotter module. The venue was at the 'other school' in Chicago, the one with a more practical interest in Economics and Finance

An aside about Chicago: Long ago, and far away, I was trained as an acolyte 'monetarist' by disciples of Herb Stein's CEA and the Fed, in the [University of] Chicago school [a fad, seemingly long forgotten by recent Economics and Finance grads, so far as I can tell]. Monetarism is a forgotten religion these days; the Fed stopped formally publishing its M3 series a few years ago, in light of the rise of what Bill Gross calls the 'shadow banking' system.

Luke: [The robot] claims to be the property of an Obi-Wan Kenobi. Is he a relative of yours? Do you know what he's talking about?
Obi-Wan: Obi-Wan Kenobi. Obi-Wan... Now, that's a name I've not heard in a long time. A long time.
Luke: I think my uncle knows him. He said he was dead.
Obi-Wan: Oh, he's not dead... Not yet.
Luke: You know him?
Obi-Wan: But of course I know him

Many of the organizers were known to me from my email correspondence or from observing their packages, and I had spoken with one (Dirk) two or three years ago briefly after the trading shim was first usable

Dirk seems to be a bundle of unbounded energy. His tools had solved a lot of data storage and visualization issues for me early on in our project. He led a push a couple years back to drill in many of the R add-on modules into the main Debian archives. I still hope to emulate his example in rpm space using R2spec and some post-processing scripts (Dependency enumeration is not quite perfect yet). Dirk has already mentioned in his blog the 'after-sessions' at Jak's; we closed the place down Saturday with useful brainstorming happening long into the night

I had resolved to travel to the conference to learn, and to stay quiet as to matters of FOSS and advocacy. I was almost even able to keep to that intent, save at the interstitial times. The formal presentations were amazing in their quality of content, competence of the presenters, and challenging to my old knowledge of Statistics and Mathematics. I even understood most of what the presenters were doing, and why, on the formal finance side and will re-read the slide decks with great interest when they appear to fill in the holes. One would have probably had to be there to draw much more from the decks, as the presenters were not doing the occasional 'stand and read' presentation one finds at some conferences, but rather largely used their decks as reminders of the points they wanted to hit and elaborate on in their presentation, and to state exactly the code and formulae in play. The committee have really set a high bar to reach for next year's event to top, and I look forward to it already

The pre-conference tutorials were worthwhile. I knew Jeff Ryan's work from xts and IBrokers of course, and gained insight into his mental roadmap on where the code is going and how it will get there. I think the enhancements he is trialling in xts will pretty clearly flow back upstream into zoo in general form; I had not heard of Dale before, but his breakout and presentation of an analytic approach on addition and testing of single constraints (I have covered scientific method and epistemology here before, and will again) served as a fine warm-up to the formal sessions

During the session breaks, at meals, and into the night, I had a chance for give and take at length with several of the committee, presenters, and attendees, to bridge what Patrick Burns spoke on -- the chasm between Practice and Theory

Part of the reason for the trip, and my need to listen, was to get a handle on how to match the shim and R as a pair of heavy-weight co-processes, so that the user of the shim can hook in and use the wonderful tools already in R space. We'll most likely get there, but the timing is not clear. Having said 'we', permit me to make it clear that the heavy lifting will be done, if and when done, by Bill, and not me. At the prodding of Peter, who as I understand it regularly team codes with Brian, I have started the sign-up process for an account at r-forge, and will 'cut my teeth' on a simple connector or module, to warm up my skills as a co-development tester and 'guinea pig' consumer of the major task of integrating the shim

FIX rules the roost as the 'lingua franca' for exchanging order, position and fill data with counter-party upstream brokers or exchanges (thanks here to the CME Foundation for partially funding the event). We will not soon be adding a compressed FIX connector to the shim, and certainly not before we attain our major milestone of a formal 'complete' first release.

Finally, a couple folks asked why we were playing down in retail space with the TWS and its vendor specific API. For a researcher, and for a small proprietary trader, we still find IB's API and services the most affordable, and substantially complete. It is a gateway to enable any interested researcher to do material research (the 'Theory') and strategy development and execution (the 'Practice'). For student academics, the availability of IB's 'trading Olympiad' program and the shim, and R offer all one needs for a better than free price

Update: We see also the summary of the event at Revolution Computing, an R vendor

16 April 2009

Afraid of experimentation

The #centos IRC channel at irc.freenode.net never ceases to amaze me. We get questions that would take at most 30 seconds of reading a man page and experimentation to answer, asked over and over again.

Here is one of the latest:

16:14 clueless> I installed a Windows Vista Business x64 VM on 5.2. Is it possible to get hibernate/sleep to work?
16:15 clueless> I want only want the VM up occasionally and I'd rather not wait for a full boot every time

Firing up a xen virtual machine in a root panel, and popping open another to read the xm man page, I find:

# cd /etc/xen
# ls
# xm create win-2000pro
Using config file "./win-2000pro".
Started domain win-2000pro
# virt-viewer win-2000pro

and a Windows 2000 session appears. I let it boot to the login prompt, and then:

# cd /var/lib/xen
# xm list
# xm pause win-2000pro
# xm save win-2000pro win-2000pro-save.img

Which of course as the man page promises, terminates the running image. Then:

# xm restore win-2000pro-save.img
# xm list
# xm unpause win-2000pro
# virt-viewer win-2000pro

And we are right where we left off, at the initial log in prompt.

Is it so hard to at least pretend to look first?

01 April 2009

I propose that women have 28 teeth

teeth to count
Why have men more teeth than women?
By reason of the abundance of heat and blood which is more in men than in women.
  -- "Of the Teeth.", Aristotle

One of the mysteries behind the quote above, was why Aristotle did not simply find a near-by woman, and ask her to permit him to count her teeth

How do we know what we 'know' to be true? The difference here is of course that between 'deductive' and 'inductive' analysis

Political 'debate' and flame wars on which Linux distribution (package manager, editor, MTA, and so on ad infinitum) is better, often degenerate to deductive reasoning from a firmly held (perhaps from ideological basis, perhaps from prior experience) 'Theory'. Done properly, one is then to state a testable 'Hypothesis', actually perform field or experimental 'Observation' to validate or disprove that hypothesis, and finally reach a 'Conclusion' that the Theory is supported or not. Aristotle omitted the critical stages of testing his hypothesis, and so fell into error with his assertion. Pure reason led him astray

It is just as easy to fall into error from the inductive reasoning side. I have noted for many years now that in early February, I see newspaper reports that the groundhog ("Punxsutawney Phil") is reported as seeing his shadow (consider the hints from the Bill Murray movies, 'Caddyshack' and 'Groundhog Day'). That he sees his shadow seems to cause Winter to continue for six weeks or so

The cardinal birds also must read the newspaper and observe the shadow sighting report in timing their return to north of the Mason-Dixon Line. When the timing is right, the cardinals return to my town. It takes a week or two, but once the cardinals have reported back to the southern over-wintering havens, the robins follow them

The return of the cardinals also causes the forsythia bush out back to bloom (I suspect there is some needed chemical agent in the bird droppings). This is important because it needs to snow on the forsythia three times before it is safe to plant the vegetable garden to avoid the seedlings being frozen and killed

My chain of 'Observation' is most careful, taken over many years. A 'Pattern' emerged that I could see, and so I formed a 'Hypothesis' as to what was occurring. My 'Theory' seems to explain nature well. The 'inductive' results are of course completely wrong and untestable, and confuse co-incidence (sequentially timed events) with causation

The XKCD website has this:
and if you are not reading that site regularly, you should be. We'll be using statistics soon enough here

At the end of all the back and forth about deductive and inductive methods, we have to end up at the conclusion that pure logic is but an organized way of committing error. Nothing can replace putting forth a testable hypothesis, and getting down and dirty in the data testing it to confirmation or refutation

Critical note. — Of a piece with the absurd pedagogical demand for so-called constructive criticism is the doctrine that an iconoclast is a hollow and evil fellow unless he can prove his case. Why, indeed, should he prove it? Is he judge, jury, prosecuting officer, hangman? He proves enough, indeed, when he proves by his blasphemy that this or that idol is defectively convincing — that at least one visitor to the shrine is left full of doubts. The fact is enormously significant; it indicates that instinct has somehow risen superior to the shallowness of logic, the refuge of fools. The pedant and the priest have always been the most expert of logicians — and the most diligent disseminators of nonsense and worse. The liberation of the human mind has never been furthered by such learned dunderheads; it has been furthered by gay fellows who heaved dead cats into sanctuaries and then went roistering down the highways of the world, proving to all men that doubt, after all, was safe — that the god in the sanctuary was finite in his power, and hence a fraud. One horse-laugh is worth ten thousand syllogisms. It is not only more effective; it is also vastly more intelligent.
  — The American Mercury. p. 75., Henry Louis Mencken (1880-1956)

broken idol
But then you get a lot of angry letters, from those whose clay idol you have smashed

edit: two typo fixes

31 March 2009

Who is in charge, here?


"When I use a word," Humpty Dumpty said, in a rather scornful tone, "it means just what I choose it to mean - neither more nor less."
"The question is," said Alice, "whether you can make words mean so many different things."
"The question is," said Humpty Dumpty, "which is to be master - that's all."
   -- Through the Looking Glass

A U S Supreme Court memorandum order (called a 'Slip Opinion' here) today said:

PER CURIAM.
The writ of certiorari is dismissed as improvidently granted.
It is so ordered.


Some Latin in there. 'PER CURIAM' is: By the Court as a complete panel and entity, without specific attribution of the action to any particular Justice. A 'writ of certiorari' is: a publicly stated intent of the Court to receive a case for presentation and argument, and possibly (usually) decision.

That a matter is characterized as: "improvidently granted" is not Latin, but is a 'term of art' -- basically: It turns out we (as a decision making body) cannot, will not, or should not decide after all, and we decline to consider the particular aspect of the case we initially thought we should hear for the present.

It happens -- maybe with one member of the Court ill, the Court decided it needed to lighten its load; perhaps some conflict came to light in the investment portfolio of some Justices that the remaining (non-recusing) panel members of the Court felt now that fairly they could not hear and decide the matter, to avoid the appearance of making a biased, short handed, or improper decision.

The functional effect of such a terse statement is to leave intact and in effect the next prior lower court's ruling

While a lawyer's craft is well depicted in television crime procedurals, the more cerebral parts of avoiding issues which make one interesting for television, are decidedly more valuable services a lawyer provides to the general society, when advising a client. A lawyer's opinion can help the client see the areas to avoid; how to structure its affairs. The Court speaks in a completely understood fashion here, to communicate just what it intends to say, and nothing else

The recent 'mob rule' in Congress of vilifying and proposing to punishingly tax the folks at AIG who had clear contracts for payment of a 'retention' and performance bonus, in exchange for staying on as the ship of AIG seemed to be sinking (and indeed staying and doing their contractual duty) shows how fragile civilization is. All the hollow words about 'fairness' and populist anger cannot mask the fact that Congress successfully pointed the finger of blame away from itself and the government, and toward a tiny minority of our society. If a Constitution, contracts, and rule of law are so readily cast aside, no one is safe. Paris 1789 and following, all over again. Whom shall the mob turn to next?


The use of words is how we explain to another, and sometimes to ourselves, what we are thinking; why we believe what we believe; permit us to reflect and find weak points in our thought processes. Structured words -- Court decisions, Constitutions, laws, opinion letters from lawyers are part of an ongoing societal dialog

We are all diminished when cheap talk trumps reason

OMG, Round Two

ring girl, round two

I wrote a bit back about a gratuitous change in Red Hat's RPM variant breaking backward SRPM readability, in a fashion which stranded users of the earlier Red Hat Enterprise release products (and rebuilds such as CentOS) away from the Raw Hide pool of developmental edge packages.
Mothra attacks

The fix I outlined: to build and freeze (in time, and against updates) a RawHide domU instance, and to use that domU and an NFS mount back into the earlier dom0 to unpack SRPMs. This works fine for the present.

The full size screen shot is a bit large, but is down that link. It takes just a couple of seconds to set up a new unpacking destination, and to do the rebuild, once it is set up.
moving SRPMs from RawHide

27 March 2009

Promoting ignorance

Schultz knows nothing
There is a good reason lawyers should not give advice in a public forum, and are really uncomfortable having a client publicly discuss advice they have given

This crossed a mailing list today:

Subject: [fedora-d-rh] Re: question about patent

Without reading or looking at the patent at all, it is almost always really bad to discuss patents in public, especially on email.

Patents & patent trolls are so pervasive that you can help feed patent trolls by bringing up the possibility of infringement in these forums (even when they are marginal claims).

I have always been given guidance that engineers should never, ever do patent searches and never discuss the specifics of IP issues in email.

Amazing takeaway. The poster missed the obvious extension that really NOTHING in the way of litigation awareness and preparations should be discussed

A quick Google search using: willful ignorance of a patent yields this in a pull quote:

Courts have used terms such as *intentional blindness,[15] *blind disregard of the peril it faced[16] and *willful ignorance[17] to describe the accused infringer who did not conduct a search prior to adopting a mark

[later] ... With the ease of accessing information, it is likely that courts will increasingly find that an accused infringer's failure to conduct an appropriate search before adopting its proposed mark is a clear indicator of bad faith.

The article's author 'threads the needle' nicely, between providing general information, and not giving express advice. But he DOES assume the reader recipient will CONSIDER the implications of what is being said. Silly lawyer
camel in the eye of a needle

Down at the bottom of that information article, we find:

The information contained in this alert is provided for informational purposes only and does not represent legal advice. Neither the APLF nor the author intends to create an attorney client relationship by providing this information to you through this message.

Time to stretch the legs, and walk down to Stauff's for a coffee

26 March 2009

IPv6 eats kittens (and distcc) on Debian Testing

Flikr domo and kitten

This can only end badly

I spent a good 5 hours this week, tracking down a problem with distcc hanging up in our Debian Testing build farm. We use distcc to speed up compilation of the C++ sources in the development of the trading shim. Interestingly, our end user community forced us to this decision of developing on Debian testing, as they are using later gcc versions than we were on CentOS, and it was useful to be able to see their errors, BEFORE they reported them to us

On the new compile farm, sometimes we would get a compile in, say, 44 seconds; other times it would drag out for several minutes. This is a problem as we had just slotted a new unit into harness, and expected better results

In checking the logs in the client doing the distribution of compilation tasks, we were seeing a symptom of 'segfaults' in that client's process; other times, the client would stall, seemingly blocked waiting for a compilation result to come back from a remote buildfarm peer, that never came back. Checking on the remote build unit, one of the distccd children would die for mysterious reasons, leaving a message in the dmesg record. Once that failed build timed out, the needed file would be built locally, and the build proceed. Checking the log files nothing obvious jumped out

The obvious debugging technique is to get a minimal reproducer, and then to partition the problem into smaller and smaller possible causes using that reproducer tool. The issue will manifest on one setup, but not the other, and so one can rule out more and more issues, until the answer is left, staring you in the face

Looking at my Debian helper tool, it had rotted, and was in sorry need of removal of some constraints: It did not use distcc when available; it did not use proper -j parallel compiles; it did not use -O3 optimization in the compiles. My test tool was not set up to see what I needed to see

Time to pay down some 'technical debt' (If you've not read Martin Fowler's piece, and viewed Ward Cunningham's video, stop now, and do so). And so I made some payment there. After testing, I got these results:

Master   Clients                    Elapsed time (real)
pippin   nfs2,                      0m23.281s
         nfs2, pippin, localhost    0m23.702s
         pippin, nfs2, localhost    0m22.551s

My first thought looking at this: Well, that pretty conclusively rules out machine specific errors, or network path issues. It must be something different in the setup of the user provoking the issue that my tool does not duplicate. NOTE: This is wrong-headed, of course, as: 'An absence of evidence is not evidence of absence of a problem' but was an easy trap to fall into

For every complex problem, there is a solution that is simple, neat, and wrong.

  — H. L. Mencken

For every problem there is a solution which is simple, obvious, and wrong.

  — Albert Einstein

I tossed my results at that user for their thoughts on the results, and went back to work on another issue

Later in the day, doing some thought experiments with the user, we could not pin down where to look yet. But as a team, I had him provoke the issue with his setup, while I watched the logs on the various machines through several consoles. And the error appeared, and then jumped out and tickled my eyeballs. I was watching nothing in particular, until I saw the failure on process 29673, and then traced that back up. A successful and a failed session looked like this, respectively:

distccd[29673] (dcc_check_client) connection from :ffff:
distccd[29673] (dcc_r_file_timed) 909179 bytes received in 0.078651s, rate 11289 kB/s
distccd[29627] (dcc_collect_child) cc times: user 1.132070s, system 0.144009s, 23039 minflt, 0 majflt
distccd[29673] (dcc_collect_child) cc times: user 1.092068s, system 0.104006s, 22481 minflt, 0 majflt
distccd[29673] (dcc_check_client) connection from ::ffff:
distccd[29673] (dcc_r_file_timed) 818437 bytes received in 0.071648s, rate 11155
distccd[31248] (dcc_check_client) connection from ::ffff:
distccd[31248] (dcc_r_file_timed) 886761 bytes received in 0.076688s, rate 11292 kB/s
distccd[29627] (dcc_collect_child) cc times: user 1.068066s, system 0.112007s, 23890 minflt, 0 majflt
distccd[29673] (dcc_collect_child) cc times: user 1.108069s, system 0.112007s, 22012 minflt, 0 majflt
distccd[29673] (dcc_pump_sendfile) Notice: sendfile: partial transmission of 15868 bytes; retrying 344332 @15868
distccd[1995] (dcc_log_child_exited) ERROR: child 29673: signal 11 (no core)

A-ha! Now we know what to look for:

dhcp-231:/var/log# grep dcc_pump_sendfile distccd-transition-log
distccd[29673] (dcc_pump_sendfile) Notice: sendfile: partial transmission of 15868 bytes; retrying 344332 @15868
distccd[31248] (dcc_pump_sendfile) Notice: sendfile: partial transmission of 15868 bytes; retrying 586732 @15868
distccd[30262] (dcc_pump_sendfile) Notice: sendfile: partial transmission of 15868 bytes; retrying 4655916 @15868
distccd[2005] (dcc_pump_sendfile) Notice: sendfile: partial transmission of 16384 bytes; retrying 74824 @16384
distccd[2128] (dcc_pump_sendfile) Notice: sendfile: partial transmission of 16384 bytes; retrying 286560 @16384
distccd[2170] (dcc_pump_sendfile) Notice: sendfile: partial transmission of 16384 bytes; retrying 97440 @16384
distccd[2129] (dcc_pump_sendfile) Notice: sendfile: partial transmission of 16384 bytes; retrying 301000 @16384
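
The grep above shows the pattern by eye; a tally makes it explicit. What follows is a minimal sketch in Python, used here purely for illustration (no such script was part of the original debugging). It assumes the log line layout exactly as shown in the grep output, and the names LOG_LINES, PARTIAL and tally_partial_sizes are hypothetical.

```python
import re
from collections import Counter

# The seven lines returned by the grep above, copied verbatim.
LOG_LINES = """\
distccd[29673] (dcc_pump_sendfile) Notice: sendfile: partial transmission of 15868 bytes; retrying 344332 @15868
distccd[31248] (dcc_pump_sendfile) Notice: sendfile: partial transmission of 15868 bytes; retrying 586732 @15868
distccd[30262] (dcc_pump_sendfile) Notice: sendfile: partial transmission of 15868 bytes; retrying 4655916 @15868
distccd[2005] (dcc_pump_sendfile) Notice: sendfile: partial transmission of 16384 bytes; retrying 74824 @16384
distccd[2128] (dcc_pump_sendfile) Notice: sendfile: partial transmission of 16384 bytes; retrying 286560 @16384
distccd[2170] (dcc_pump_sendfile) Notice: sendfile: partial transmission of 16384 bytes; retrying 97440 @16384
distccd[2129] (dcc_pump_sendfile) Notice: sendfile: partial transmission of 16384 bytes; retrying 301000 @16384
""".splitlines()

# Assumed layout: "distccd[PID] (dcc_pump_sendfile) ... partial transmission of N bytes"
PARTIAL = re.compile(
    r"distccd\[(\d+)\]\s*\(dcc_pump_sendfile\).*partial transmission of (\d+) bytes"
)


def tally_partial_sizes(lines):
    """Count how often each partial-transmission size appears in the log lines."""
    sizes = Counter()
    for line in lines:
        match = PARTIAL.search(line)
        if match:
            sizes[int(match.group(2))] += 1
    return sizes


if __name__ == "__main__":
    for size, count in sorted(tally_partial_sizes(LOG_LINES).items()):
        print(f"{size} bytes: {count} occurrence(s)")
```

On the seven lines shown, only two sizes turn up: 15868 bytes three times and 16384 bytes four times.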

The TCP process of shuttling code to compile, and the binary results of such compiles, is failing the same way, over and over again: a partial transmission of about 16 kB (15868 or 16384 bytes) is present every time. Looking at the log entry again, the form of the connecting hosts is unusual: ::ffff: and ::ffff: Why, that is IPv6 notation! And I reach back to my logs, as I remember I had an issue like this a year or so ago on a Debian box
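
The ::ffff: prefix is the marker of an IPv4-mapped IPv6 address: an IPv4 peer seen through an IPv6 socket. A minimal sketch in Python's standard ipaddress module shows how to recognize one. The peer addresses are elided in the logs above, so the address used in the usage note is a documentation-range placeholder, not a real build farm host, and describe_peer is a hypothetical helper name.

```python
import ipaddress


def describe_peer(addr_text):
    """Say whether a peer address is plain IPv4/IPv6 or an IPv4-mapped IPv6 address."""
    addr = ipaddress.ip_address(addr_text)
    # ipv4_mapped is None unless the address has the ::ffff:a.b.c.d form.
    if addr.version == 6 and addr.ipv4_mapped is not None:
        return f"IPv4-mapped IPv6, underlying IPv4 peer {addr.ipv4_mapped}"
    return f"plain IPv{addr.version} peer {addr}"


if __name__ == "__main__":
    # 192.0.2.10 is a placeholder from the documentation range (assumption).
    for text in ("::ffff:192.0.2.10", "192.0.2.10", "2001:db8::1"):
        print(text, "->", describe_peer(text))
```

Seeing the mapped form in a daemon's log suggests the daemon is listening on an IPv6 socket even though the clients are connecting over IPv4.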

And so, Google with the search argument: debian ipv6 distcc confirms as its first result: 1. #481951 - distcc: zeroconf support broken wrt IPv6 - Debian Bug ... ... and the bug is still open. Killing off IPv6 is the obvious next step, and so, back to Google with: debian disable IPv6 to find: Disabling IPv6 under a 2.6 kernel. Reading the post, there is some back and forth, and the answer seems to be, there is not an 'official Debian answer', but this is what people are doing. Back to Google with: site:debian.org debian disable IPv6 seems to confirm that there is not a single well documented answer which has floated up in Google's searching

Compare: CentOS addresses the matter directly, and as the first Google hit with: site:centos.org disable IPv6
7. How do I disable IPv6?

* Edit /etc/sysconfig/network and set "NETWORKING_IPV6" to "no"
* Add the following to /etc/modprobe.conf :

alias ipv6 off
alias net-pf-10 off

* Run chkconfig ip6tables off to disable the IPv6 firewall
* Reboot the system

Alternative (which might be easier and works on any release with /etc/modprobe.d):
echo "install ipv6 /bin/true" > /etc/modprobe.d/disable-ipv6

Sadly, there is something else on Debian testing in play as well, and it is not just an IPv6 issue (although turning off IPv6 has drastically reduced the frequency of the issue). When I look in today to make sure the 'fix' is working, I find:

[74988.951989] distccd[8671]: segfault at 1 ip 7fdd2250e030 sp 7fff2b025da8 error 4 in libc-2.7.so[7fdd22493000+14a000]
[74989.017836] distccd[8651]: segfault at 1 ip 7fdd2250e030 sp 7fff2b025da8 error 4 in libc-2.7.so[7fdd22493000+14a000]
[74989.518050] distccd[8664]: segfault at 1 ip 7fdd2250e030 sp 7fff2b025da8 error 4 in libc-2.7.so[7fdd22493000+14a000]
[74994.152461] distccd[8659]: segfault at 1 ip 7fdd2250e030 sp 7fff2b025da8 error 4 in libc-2.7.so[7fdd22493000+14a000]
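
Those dmesg lines carry more than a first glance suggests. A minimal sketch, in Python purely for illustration (it was not part of the original debugging): it pulls the fields apart and subtracts the library load address from the instruction pointer, which gives the offset of the faulting instruction inside libc-2.7.so. The regular expression assumes the message layout exactly as shown above, and fault_offset is a hypothetical helper name.

```python
import re

# Assumed layout: "PROG[PID]: segfault at ADDR ip IP sp SP error N in LIB[BASE+SIZE]"
SEGFAULT = re.compile(
    r"(?P<prog>\w+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<lib>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)

# First of the four lines above, copied verbatim.
SAMPLE = (
    "[74988.951989] distccd[8671]: segfault at 1 ip 7fdd2250e030 "
    "sp 7fff2b025da8 error 4 in libc-2.7.so[7fdd22493000+14a000]"
)


def fault_offset(line):
    """Return (library name, offset of the faulting instruction within it)."""
    match = SEGFAULT.search(line)
    if match is None:
        raise ValueError("not a segfault line in the assumed format")
    ip = int(match.group("ip"), 16)
    base = int(match.group("base"), 16)
    size = int(match.group("size"), 16)
    offset = ip - base
    # Sanity check: the instruction pointer must fall inside the mapping.
    if not 0 <= offset < size:
        raise ValueError("instruction pointer lies outside the named mapping")
    return match.group("lib"), offset


if __name__ == "__main__":
    lib, offset = fault_offset(SAMPLE)
    print(f"{lib} + {offset:#x}")
```

All four lines share the same instruction pointer and load address, so every child died at the same place in libc, on an access to address 1. With matching libc debug symbols, a tool such as addr2line could probably turn that offset into a function name.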

Where is that coffee cup? I knew this would not end well
domo eating a kitten

"It's different, this time"

Winston Smith

The British born, formerly American investment manager, Sir John Templeton, is credited with the following as to his craft:

The four most dangerous words in investing are 'This time it's different.'

I suspect the quip is over-constrained in limiting it to just investing. But I am meditating about another Briton's work

At last night's COLUG meeting, the presenter addressed the emergence of the latest round of internet based 'social networking' applications and devices: twitter, facebook, blogging, and multi-featured personal information devices (cell phones, Blackberries, iTouches, digital cameras and the like). I say latest round, because the assertion was made that: "Terrorists have never used photo reconnaissance" and, contrarian that I am, I suggested that the people of Dresden might have a different point of view

The takeaway from the matter had to be that a thoughtful person needs to be mindful of the obvious and non-obvious implications of these new technologies

The ability to build a 'mosaic' image of a person, from their public 'internet persona' is only getting easier, and more accessible to a wider audience of potential prying eyes. What once required the resources of a government or major multi-national corporation to 'dig out' is now, perhaps thoughtlessly, revealed with all good intention. See, e.g., the 'Sarah' PSA: ("Online Sexual Exploitation - Everyone Knows Your Name"), which ends with the outline: "... so think before you post"

But the information leakage is much broader than that already, and at this point not controllable by any individual. When a member of a 'private' or 'backwater' mailing list uses GMail to subscribe, every poster suddenly is added to Google's indexing corpus; when someone at a local meeting snaps a cell phone picture and posts it publicly, it feeds the automated identification algorithms publicly known (Google's Photo), and otherwise (Think: the Tampa Bay Super Bowl photo identification effort of the crowd). Note the date of the Register article just cited: 7th February 2001. This was no Bush-ian crypto fascist over-reaction to the 9/11 hijackings

During the presentation last night, the first advert link offered was for anti-aging patent drugs, alongside the meeting photo (full of several grey haired and bald male persons); the second link was of 'Valerie Bertinelli -- Bikini Babe!' and had a weight loss advert in the 'doubleclick' advert box on the top right; but our presenter is interested in and follows a television show 'The Biggest Loser' and is browsing weight control related sites and mailing lists. A third, rather personal example from the presenter's prior experience completed the circle to make it clear that Google's advert engine is reading every word we read or write

The first time is an occurrence; the second a co-incidence; after the third, one has to stop shaving with Occam's razor as the blade has gone dull

blank advert

I took a screenshot (full-size image) of what I am offered as to Valerie, and you'll notice that the upper right panel is blank. This is because some years ago, I amended the DNS records which computers using my DNS servers are provided, to return '' for all of 'doubleclick.net'

[root@xps400 conf]# grep -i doubleclick *.conf
NULLROUTE.conf: ad.doubleclick.net.
[root@xps400 conf]#

Adding that value (which causes the request for an advert to never reach the central advert monitoring and image feeding servers), and several more was part of a campaign for a corporate client I was consulting for at the time. The Windows 98 desktop computers which were issued to the staff did not have effective software installation access controls, to prevent the addition of random malware and time wasters. Memos and meetings had not stopped the practice of a staffer downloading, say, Yahoo! Instant Messenger, and showing all her friends in that department how to do the same. Bandwidth exhaustion was becoming an issue; I assume that management also had some thoughts about lost productivity

As a technical fix the IS department was asked to remove it when found (done, but not persistent without effective access controls), and asked again. I was escalated in, and went to work with tcpdump

It turns out that the software designers at Yahoo knew their craft well. From memory, it first tried the universal Firewall Traversal Protocol (http), and then secure http and FTP

I blocked each new approach in turn. It fell back to nntp, and as I recall ntp. I do not recall that it tried to use dns content tunneling, but I certainly would have. The eventual solution had both port blocking and domain blacklisting
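
The domain blacklisting half of that solution comes down to suffix matching on DNS labels. A minimal sketch in Python, for illustration only: the actual resolver configuration is not reproduced here, and is_blocked and the example blocklist are hypothetical. It matches on whole labels, so that a name which merely ends in the same letters is not caught.

```python
def is_blocked(hostname, blocked_domains):
    """Return True if hostname is a blocked domain or any subdomain of one."""
    # Normalize: DNS names are case-insensitive and may carry a trailing dot.
    labels = hostname.lower().rstrip(".").split(".")
    # Test the full name, then each parent domain in turn.
    for start in range(len(labels)):
        if ".".join(labels[start:]) in blocked_domains:
            return True
    return False


if __name__ == "__main__":
    blocked = {"doubleclick.net"}  # example blocklist (assumption)
    for name in ("ad.doubleclick.net.", "doubleclick.net", "notdoubleclick.net"):
        print(name, "->", "blocked" if is_blocked(name, blocked) else "allowed")
```

Matching whole labels rather than raw string suffixes is the design choice that matters: a plain endswith test would also block unrelated names that happen to end in the same characters.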

There is nothing new, nor indeed to my thinking, wrong, in the owner of an asset seeking to maximize profit from it. But I think my thoughts and my words are my property, and on occasion on a 'think piece', I'll add the copyright reminder tag

.-- -... ---.. ... -.- -.--
Copyright (C) 2009 R P Herrold
My words are not deathless prose,
but they are mine.
Number 6

I also hold to the quaint notion that I am not a number, but an individual and the property of no one but my God. Silly, I know, but there you are

edit: typo fix

25 March 2009

People do go both ways

Scarecrow: people do go both ways
There is a scene depicted in the movie: 'Battle of the Bulge' (1965) about the 1944 attempted German breakout offensive through the Ardennes, where German commandos are tasked with and shown changing road signs to confuse Allied troops

When I started this blog, it was in response to a desire to make the CentOS internals a little more transparent to interested observers. We at the project do get the questions, and I think a thoughtful reader can pull connections from the little stories and examples I choose from the full breadth of the blog. While I might 'tag' something specifically 'CentOS', real life has no such natural boundaries, and these are just guide markers in the channel of life.
confused highway sign

I added the blog into the CentOS aggregator at planet.centos.org, and set to writing. I cribbed the configs from an example of another CentOS member. I tried then to restrict the feed to the 'CentOS' label, but following the documentation just did not work. I settled for the default full feed, and resolved to revisit the matter later

My friend toracat gently reminded me of the need to finish the job, this morning. Sigh ... back to wrestling markup

The example follows [there are annoying line breaks in the blog layout as rendered, and indeed in the doco upstream that need to be pasted back together, mentally]. Can you spot the error?

Full site feed:
  • Atom 1.0: http://blogname.blogspot.com/feeds/posts/default
  • RSS 2.0: http://blogname.blogspot.com/feed/post/default?alt=rss
Label-specific site feed:
  • Atom 1.0: http://blogname.blogspot.com/feeds/comments/default/-/labelname
  • RSS 2.0: http://blogname.blogspot.com/feeds/comments/default?alt=rss/-/labelname
Individual post comment feed:
  • Atom 1.0: http://blogname.blogspot.com/feeds/postId/comments/default
  • RSS 2.0: http://blogname.blogspot.com/feeds/postId/comments/default?alt=rss

There is the obvious need to s/comments/posts/g, but more is needed. I am accustomed to 'magic CGI directories' that accept variables. I use them myself. See, e.g., the expanded URL to the thumbnail of Mothra which is not just an image, but the filename, and a link to the full size one. No express CGI script is called out, as the index file for that directory is actually a smart CGI script


Enough clues, and on to the answer. I put a bit of text around the answer so your eyes do not pick it out. The text at the fourth bullet above is malformed ... the part following: alt=rss needed to precede the question mark marker that identifies the start of variables to the CGI script. We move before it the part: /-/labelname and add the desired label. Now a custom subfeed chosen by label is properly specified
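
The corrected form can be written down as a small URL builder. This is a minimal sketch in Python, assuming only the URL layout described above (with comments swapped for posts, and the /-/labelname path segment placed before the query string); label_feed_url is a hypothetical helper, not anything Blogger provides.

```python
from urllib.parse import quote, urlencode


def label_feed_url(blogname, label, alt=None):
    """Build a label-specific post feed URL; alt='rss' asks for RSS 2.0."""
    # The label is part of the path, so it must come before the '?'.
    path = f"/feeds/posts/default/-/{quote(label, safe='')}"
    url = f"http://{blogname}.blogspot.com{path}"
    if alt:
        # Query variables go last, after the complete path.
        url += "?" + urlencode({"alt": alt})
    return url


if __name__ == "__main__":
    print(label_feed_url("blogname", "CentOS"))             # Atom 1.0
    print(label_feed_url("blogname", "CentOS", alt="rss"))  # RSS 2.0
```

Keeping path assembly and query assembly as two separate steps makes the error in the quoted example hard to repeat.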

But there are no road signs on the Blogger provided doco page to permit easily reporting errors, so that they might be fixed

Nuts
Anthony McAuliffe

23 March 2009

No relation

Separated at birth?
We get questions, asking about the use of the 'orc' moniker, in IRC and at this blog.
  • Tolkien inspired?
  • World of Warcraft?
  • some older mythology?
  • None of the above
Nothing so derivative.

When first using IRC, the Freenode 'nickserv' wanted a userid not in current use, and the 'usual suspects' I prefer to use were long locked up; so: orc_orc and the related variants I use. The 'Blogger' software added a later constraint to the DNS character set, causing the drop of the underscore to form a valid domain name.

This pair of latecomers pictured above may be related to one another, but I have to disclaim any connection.

20 March 2009

Every step you take ...

a completely trackable and traceable survey tool
I received the above email [which I converted to a maskable image], with embedded web link, seeking market research data. I have masked the full URL, to prevent 'ballot box stuffing' and to protect my privacy

Now in doing good statistical sampling, customarily one assures the recipient / respondent that the responses are aggregated, and that no personally identifying information is available to the researcher. This is done to foster truthfulness and frankness from people responding to the survey, by reassuring them that no information leak, say back to the entity covered by the survey, can tie particular positive or negative 'pull comments' to a specific person
Other survey research techniques use 'calibration' questions, repeated in slightly varying form a couple of times in the survey, to make sure the respondent is actually reading the questions, is answering consistently, matches the 'shaped sample' desired demographic, and similar concerns

Here, I am solemnly (or perhaps, cheerfully) assured:
We will also gladly share the aggregate results of the survey with you, as it may be of interest to you.

All responses will remain anonymous and confidential.

What it does not say is that the author is not planning to use the data for selling 'individual drill down' detail by respondent

The sender is sort of aware of this, or perhaps it is just a boilerplate footer from SurveyMonkey:

This link is uniquely tied to this survey and your email address, please do not forward this message.
I think I will pass on this one. Time for more coffee

Revised to lay better in the top table 20 March 2009

18 March 2009

Caveats and Disclaimers

fine print
This is a bit of housekeeping about this blog -- the boilerplate so to speak. I mentioned the need to do it, so here it is

I am an economist, duly trained both in academia, and in that broader school of life. I am a 'rough around the edges' statistician. I have been coding since before formal exposure to either of those disciplines. I am a mathematician. None of these pursuits carry formal certifications relevant here.

I am a lawyer, trained at a top ten school, long ago and far away, it seems.

---------------start disclaimer-------------------
I_A_AL, but not your lawyer. I offer legal advice and formal
opinion only within the confines of a previously established
and explicit attorney-client relationship where privilege may
be had; and NEVER on a public list server.
----------------end disclaimers ------------------

I may own positions from time to time in entities mentioned, and while I will try to flag such, obviously times and holdings change, and I'll not be updating such enumerations. I am NOT your investment adviser, not licensed as such, offer merely opinion which I may or may not advocate (an economist and lawyer can and should be ready to argue any side of an issue; sort of like 'high school debate club', but no holds barred) and do not render any advice or recommendation as to such matters

And this fun one:
"This publication is designed to provide accurate and authoritative information in regard to the subject matter covered. It is sold with the understanding that the publisher is not engaged in rendering legal, tax, accounting, or other professional service. If legal advice or other expert assistance is required, the services of a competent professional person should be sought."

-- from a Declaration of Principles jointly adopted by
a Committee of the American Bar Association and a
Committee of Publishers and Associations.
Generally, the appearance of trademarks or registered trademarks
within this blog is a nominative and factual matter, as
and for description and identification.
See, generally, 15 USC 1115(b)(4).

I am in no wise interested in any implied trademark
infringement or counterfeiting (15 USC 1114(1)); false
designation or unfair competition (15 USC 1125(a));
dilution (15 USC 1125(c)); common law infringement or
unfair competition, or dilution; violation of business
practice law or regulation as to use of marks.

No patents are knowingly infringed, nor 'trade secret' or NDA matter disclosed

The photos used are Creative Commons licensed, or otherwise under a copyright I have proper access for reproduction.

Please respect my copyright

No electrons are harmed permanently in the production of my blog content, although several get quite annoyed

17 March 2009

I saw mommy, kissing Santa ...

Santa and mommy
I can see her lying back in a satin dress
In a room where you do what you don't confess
I could picture every move that a man could make
Getting lost in her loving is your first mistake

   -- Sundown, Gordon Lightfoot

It is always kind of a sad moment, watching a younger idealist encounter something that tears asunder their old mental model, and puts them on the path to being a battered, old, steel eyed mercenary. But with that loss of innocence, new doors open

One useful paradigm to look at the consumers of Enterprise *nix software is to break them into a partition of three major types:
  1. Those that have to have the 'Real McCoy', possibly for 'CYA' purposes, or because an upstream vendor says that they need the 'real' one as part of the 'silo' they will support without extra charge (if at all) to meet a performance SLA
  2. Those who do not have a strong mandate, but are generally willing to pay the minimal incremental cost such a subscription adds to their bundle of functions, and
  3. Those who will simply not pay for 'free' software: No how, no way; no, sir
The commercial enterprise Linuxes have been generally successful in 'cannibal conversions' of enterprise consumers of 'olde skoole' proprietary Unix -- the morning's news has rumor that IBM is sniffing around JAVA (Sun, by its ticker symbol). We covered the topic, and Ted Ts'o's proto-quant thought piece on this [Ted being on leave of absence from IBM to the Linux Foundation, as I recall] some months back, in the context of the future for software freedom

All the young idealists from the BSD side of the FOSS house saw their holdings of SUNW eroded away in recent years, the progressive shifts away from hardware, away from ksh v. csh language debates, into the tangled place of license issue and re-inventions of approaches on scaling, as their firm flailed with JAVA [v. rather than use the one true type safe modern OO language, c++ // sorry, could not resist], into databases with a product that will NEVER be Oracle DB, no matter how hard it tries

JAVA felt it had to move past Berkeley DB, and darn it, all the cool kids use SQL. ORCL is the only credible lead player in database space (IBM and DB2 are there of course, but databases are rounding error to IBM's financial statement). JAVA never could articulate the unique value proposition that picking up MySQL AB brought to the table, and let the acquisition languish, perhaps hoping that the database engine in the 'LAMP' stack would pull in tier 2 conversion sales (see the next part, infra). I think they have pretty well demonstrated that "hope" is not a business strategy to follow

Then there is that second tier -- FOSS *nix in through the side door, and without formal support contracts at first. "Under the Radar", so to speak. [Note: The linked article is a bit 'snarky' about Bob's new venture: Lulu, but I find it a wonderful and reliable service, to convert 'print pre-flighted' PDFs to bound books, cheaply and fast. Highly recommended.]

Just as I might choose to burn up a laser printer to print a manual, and do home-brew binding, Lulu has found a value proposition that makes me 'buy' their service, rather than 'build' it myself. They have convinced me that outsourcing my printing to tier 1 is the 'right' decision. Bob has converted me to producing wonderful documents from TeX, with his business handling the ink to paper, binding and delivery parts. It seems Bob is also 'whiteboxing' short run, 'just in time' print of conference manuals, and continuing education materials. A nice niche, but low barriers to entry

And then there is the 'No how, no way' school in tier three. This recent post in the CentOS forums, "leasing CentOS5 from DataCenter", caught my eye:
Recently we had a customer come to us asking how much we lease out CentOS for.
I thought this was an odd question - since CentOS is ... FREE

When in dialogue with them I learned they have a number of servers with a different provider that charges them $5.00 per month for the Operating system.

I thought this was a bit strange - and wondered - Is it even legal?

How can a datacenter lease out something that is free?
I could understand perhaps charging a setup fee based upon a customers requirements - this is a service --- but
for a datacenter to live off of the backs of someone else by charging for something that is free -

it just bugs me and rubbed me the wrong way -

Any thoughts - ?

Not sure why it bugged me so much - perhaps its because we write a ton of opensource software and could not imagine someone charging for the software itself.

Support / Installation / Service yes - but the software ... i thought thats what GPL protected folks from
This poster has missed the point of the GPL so widely, it is painful.

The GPL is perfectly fine with charging for software; what it requires is that the binaries be accompanied by the sources they were built from, or by an offer of access to them. This is what builds markets, and indeed, what makes CentOS possible in part. CentOS is fine with redistribution and commercialization, so long as our marks and brand are not misrepresented. [Advert: The CentOS project would put a 'tithe' of that rental to good use -- money, machines, bandwidth, and so forth, but it is not mandatory.] A better question might be: Is the data center that employs that poster itself providing the GPL required offer of sources access, and meeting its duty to provide, when it provides binaries under 'lease'?

Someone may well come along and undercut a person selling GPL and related FOSS licensed software, by offering it for less. I wrote a post encouraging people who 'cannot wait' for the CentOS 5.3 respin, or for the updates which get stacked up waiting for that stabilization process to end, to 'outcompete' CentOS. I am fine with that. I know it won't happen generally [Scientific Linux is the closest credible 'fellow traveller' remaining on this highway; Hi, Connie and Troy] soon, as it is non-trivial to ship and support the full line product

The protection of the Four Freedoms under the GPL makes it inevitable that someone will make a run at commercializing FOSS; this is a 'Good Thing'. But then the trick is to provide value; that is, also provide design services, consulting, 'service after the sale', or build a support infrastructure, to make it safe to entrust one's most valuable assets to that software. I feel CentOS meets that test in the 95% case for tier 2; others may dial that number up or down, and do according to their risk tolerance

And with that, we are back to my post sending people with an external factor 'beating on them' about SLA's, to: Go Buy from CentOS' Upstream

Disclaimer: I hold direct positions in JAVA (minimal, to keep skin in the game, and to remind me to follow it) and ORCL, and have held IBM in my past; I regularly quote against IBM as to providing third-party *nix support services. I probably need to write a Caveats and Disclaimers post

12 March 2009

Embarrassingly parallel

Bruce Schneier, in his 'Crypto-gram' summary this month, has an outlink to a story in The Register on a purported desire of the US NSA to crack Skype's call crypto

But this misses the point -- the needed technology and infrastructure are out there already, fielded, and ready to go, pretty much everywhere. Let's take a hypothetical country -- call it 'Glassware' ("US", "China" and "Elbonia" were taken)

The country of Glassware has a population of M * 10 ^ N people

Of those M * 10 ^ N people, the average family size is three, and each family has, on average, two cell phones and one television (the latest -- digital)

There is a broadcast infrastructure suitable to distributing portions of a problem sample -- say, the header block -- sufficiently long that one can detect when a 'good' private key has been found, which is sufficient to decode something encoded with an asymmetric encoding public key.
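The header block check above lends itself to a toy sketch. What follows is emphatically not the asymmetric scheme described; it swaps in a made-up XOR cipher keyed by a 16 bit integer, purely to show how a keyspace splits into chunks that workers can grind through without talking to each other. All the names (KNOWN_HEADER, toy_encrypt, search_chunk, parallel_search) are hypothetical, and the threads merely stand in for separate devices.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

KEY_BITS = 16            # toy keyspace (assumption); real keys are far larger
KNOWN_HEADER = b"HDR1"   # hypothetical known plaintext at the start of a message


def keystream(key, length):
    """Derive a toy keystream from the key (illustrative, not a real cipher)."""
    return hashlib.sha256(key.to_bytes(4, "big")).digest()[:length]


def toy_encrypt(key, data):
    """XOR the data with the keystream; the same call also decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def search_chunk(header_ct, start, stop):
    """Try every key in [start, stop); keep those that reveal the known header."""
    return [k for k in range(start, stop)
            if toy_encrypt(k, header_ct) == KNOWN_HEADER]


def parallel_search(header_ct, workers=8):
    """Split the keyspace into one chunk per worker; chunks are independent."""
    total = 1 << KEY_BITS
    step = total // workers
    bounds = [(i * step, total if i == workers - 1 else (i + 1) * step)
              for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        chunks = list(pool.map(lambda b: search_chunk(header_ct, *b), bounds))
    return [k for chunk in chunks for k in chunk]


secret = 0xBEEF                                  # the key the workers must find
header_ct = toy_encrypt(secret, KNOWN_HEADER)    # the broadcast 'header block'
candidates = parallel_search(header_ct)
print(secret in candidates)
```

Only the encrypted header needs to be broadcast; each worker reports back only when a candidate key turns up, which is what makes the problem embarrassingly parallel.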

That target information is distributed over the airwaves, in the vertical blanking interval or sub-carrier side layer, itself encoded with a private key, readily decodable with one of several 'factory included' public keys

The power supply switches in the television sets do not actually place the sets into a 'No power drawn' mode -- just into a lower power use 'sleep' stand-by mode. When tickled with the right signal, and not otherwise engaged in presenting content to possessors of that unit (who might complain about glitches if the video graphics display processor did not fully paint their screen), it is possible to wake them up to do some ciphering. Good for them -- recycles the electrons, and so forth

The television has a handy feature -- it will accept and display caller ID information from nearby affiliated cellular phones, over BlueTooth -- it can be configured to ONLY display wanted cell phones, but it will receive and collate data from all phones ringing near it.

So when Mrs Glassware has her girlfriends over, and the babysitter calls during the home sales party, the TV will pop up an alert for them of the call over the din of the fun.

The TV also sends back, over SMS messages, duly encoded and encrypted, the logfiles to a series of central collation points -- Father Glassware can see when the oldest son is over at the home of the girl from the wrong side of the tracks. The benefits are as broad as the imagination can see. Who could be against protecting the children?

Those cell phones, as it turns out, are really not using very much of all that processing power they have in THEIR 'CUDA' chips to draw those dinky screens, and are really off most of the time as well.

Let's not waste their graphics processor chips as well, when they are on the charger. This is great, as it simplifies the math.

Perhaps Glassware have an even better infrastructure -- say a national conversion to High Definition digital media signaling, and a mature broadband or cable modem backbone. All the better for shuttling information around digitally.

A friend who deals with quants tells me the quants are all hot and bothered to get 4 x quad head graphics cards in Dell Precision units -- 16 GPU's, because each of them can do a 10,000 (10 ^ 4) speedup over the simple general purpose processors the underlying chassis carry. All for under $10k a unit. They are doing the math and think they can have a huge HPC farm, just in the normal overhead which their traders and developers have to have anyway to do their day jobs.

M is 3 in the US (we'll round to 4 to make the math prettier), and perhaps 10 in China, and N is 8 (a hundred million). Feel free to pick a value for your local Glassware

So properly harnessed, we have at least: M * (10 ^ N) * (10 ^ 4) in compute engines available to us -- each should be able to crank out at least 100,000 samples a second ... 10 ^ 5. In cough numbers -- sufficiently accurate for our 'back of the envelope' purposes here -- 10 is equal to 2 ^ 3, so each power of ten is worth three bits. Powers of 2 are useful, as they count the bits of key strength to solve. There are 8.6 * 10 ^ 4 seconds in a day -- call it 2 ^ 16

so: M * 2 ^ (3 * (N + 4 + 5) + 16), which is M * 2 ^ 67 with N = 8

US (M = 4 = 2 ^ 2): 2 ^ 69 key trials per day;
China (M = 10, call it 2 ^ 3): 2 ^ 70 key trials per day.

The old DES cipher had a 56 bit key, so a keyspace of 2 ^ 56 -- worst case time to solution is 2 ^ -13 days, on the order of ten seconds, and always getting better as build out scales in, without even beginning to bring to bear pre-processing tricks, one time pad reuse, identifying non-perfect implementations, planting known cribs, and the rest.
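A quick way to check the exponents is to let Python do the bookkeeping. This is a minimal sketch under the rounding used here (10 treated as 2 ^ 3, a day as 2 ^ 16 seconds); the function names are mine, not from any library. Each power of ten is worth three bits, so the decimal exponents are summed first and only then multiplied by three.

```python
import math

BITS_PER_POWER_OF_TEN = 3   # the rounding used here: 10 is treated as 2^3
SECONDS_PER_DAY_BITS = 16   # 8.6 * 10^4 seconds in a day, called 2^16
DES_KEY_BITS = 56           # DES uses a 56 bit key


def trials_per_day_bits(m, n, speedup_exp=4, rate_exp=5):
    """Return log2 of key trials per day for M * 10^N devices."""
    m_bits = round(math.log2(m))           # M = 4 is 2 bits, M = 10 about 3
    decades = n + speedup_exp + rate_exp   # decimal exponents add up ...
    # ... and each power of ten then contributes three bits
    return m_bits + BITS_PER_POWER_OF_TEN * decades + SECONDS_PER_DAY_BITS


def des_worst_case_days_bits(m, n):
    """Return log2 of the days needed to walk the whole DES keyspace."""
    return DES_KEY_BITS - trials_per_day_bits(m, n)


for name, m in (("US", 4), ("China", 10)):
    print(name,
          "trials/day: 2 ^", trials_per_day_bits(m, 8),
          "| DES worst case: 2 ^", des_worst_case_days_bits(m, 8), "days")
```

Swap in your own M and N for your local Glassware; a negative exponent in the last column means the whole keyspace falls in well under a day.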

And it is Free, free, free -- or better yet, paid for by others. What was that old saw about people living in glass houses?